Automated web scraping can be problematic. Just look at Clearview, which has leveraged open access to public websites to create a facial recognition program it now sells to government agencies. But web scraping can also be quite useful for people who don’t have the power or funding that government agencies and their private contractors enjoy.
The problem is the Computer Fraud and Abuse Act (CFAA). The act was written to give the government a way to go after malicious hackers. But instead of being used to prosecute malicious hackers, the government (and private companies allowed to file CFAA lawsuits) has gone after security researchers, academics, public interest groups, and anyone else who accesses systems in ways their creators haven’t anticipated.
Fortunately, things have been changing in recent years. In May of last year, the DOJ changed its prosecution policies, stating that it would not go after researchers and others who engaged in “good faith” efforts to notify others of data breaches or otherwise provide useful services to internet users. Web scraping wasn’t specifically addressed in this policy change, but the alteration suggested the DOJ was no longer willing to waste resources punishing people for being useful.
Web scraping is more than a CFAA issue. It’s also a constitutional issue. None other than Clearview claimed it had a First Amendment right to gather pictures, data, and other info from websites with its automated scraping.
Unfortunately, all we really have is a pinkie swear from the DOJ and a handful of decisions that only have precedential weight in certain jurisdictions. But there’s more coming. As the ACLU reports, another federal court has come to the conclusion that government efforts banning web scraping violate the rights of would-be scrapers. But, as is the case in many legal actions, the details matter.
In an important victory, a federal judge in South Carolina ruled that a case to lift the categorical ban on automated data collection of online court records – known as “scraping” – can move forward. The case claims the ban violates the First Amendment.
The decision came in NAACP v. Kohn, a lawsuit filed by the American Civil Liberties Union, ACLU of South Carolina, and the NAACP on behalf of the South Carolina State Conference of the NAACP. The lawsuit asserts that the Court Administration’s blanket ban on scraping the Public Index – the state’s repository of court filings – violates the First Amendment by restricting access to, and use of, public information, and prohibiting recording public information in ways that enable subsequent speech and advocacy.
The case stems from the NAACP’s “Housing Navigator,” which scrapes publicly available info from government websites to find tenants subject to eviction in order to provide them assistance in fighting eviction orders or finding new housing. As the NAACP (and ACLU) point out, this valuable service would be impossible if the NAACP were limited to manual searches to find affected tenants.
The state of South Carolina, via a state appellate decision, claims the NAACP is only allowed limited access: the manual searches the NAACP says render its eviction assistance efforts impossible to achieve. The federal court says the state does have the power to limit access to public records, but those limits must align themselves with the tenets of the First Amendment, which presume open access to government records by the governed.
The state comes down on the losing side here, at least for the moment. The limits proposed by the state court order nullify the services the NAACP hopes to offer. As it stands now, the state cannot escape this lawsuit because there’s enough on the record at the moment that suggests there’s a viable constitutional claim.
The NAACP alleges that without scraping, it is impossible to gather the information quickly enough to meet the ten-day deadline to request a hearing. It alleges that scraping poses at most a de minimis burden on the functionality of the website.
As discussed above, it also contends suggested alternatives to scraping, such as Rule 610, are insufficient, and that Defendants have, in any event, indicated an unwillingness to provide the information under that rule. […]
True, the evidence may eventually show that Defendants have a sufficient reason to prohibit scraping. It may indicate that the NAACP’s access to the records is unburdened by the restriction. Or, it may demonstrate that Defendants have provided sufficient alternatives to access the information. But, as alleged, the restrictions state a claim for violation of the First Amendment.
The bottom line is this: automated access to government records is almost certainly protected by the First Amendment. What will be argued going forward is how much the government can restrict this access without violating the Constitution. There’s not a lot on the record at the moment, but this early ruling seems to suggest this court will err on the side of unrestricted access, rather than give its blessing to unfettered fettering of the presumption of open access that guides citizens’ interactions with public records.