GeoAI and the Law Newsletter
Keeping geospatial professionals informed on the legal and policy issues that will impact GeoAI.
Deep Dive
In today’s Deep Dive we explore a February court ruling in Thomson Reuters v. ROSS Intelligence that could have a significant impact on the use of copyrighted material as training data. Thomson Reuters, the owner of the legal research platform Westlaw, sued ROSS Intelligence, a competitor that developed an AI-based legal research search engine, for copyright infringement. The central legal issue was whether ROSS infringed Thomson Reuters's copyright by using Westlaw headnotes to train its AI search tool. ROSS raised several defenses, including fair use.
From an AI standpoint, fair use matters because it is frequently raised as a justification for training models on copyrighted material. A court weighs four factors in determining whether fair use applies:
· Purpose and character of the use
· Nature of the copyrighted work
· Amount and substantiality of the portion used
· Effect of the use upon the potential market
In this case, the court granted partial summary judgment to Thomson Reuters on copyright infringement and rejected ROSS's fair use defense. Specifically, the court reasoned:
· Purpose and character of the use: The court found that ROSS's use was commercial and non-transformative. Although the headnotes were “turned into numerical data about the relationships among legal words to feed into its AI,” the ultimate purpose was to create a competing legal research tool similar to Westlaw. The court distinguished earlier cases involving intermediate copying of computer code, noting that this case involved written words and that the copying was not “reasonably necessary to achieve the user's new purpose.” However, the court cautioned that its analysis was limited to the non-generative AI before it, implying that the analysis might differ for generative AI.
· Nature of the copyrighted work: The court acknowledged that Westlaw's material, including the headnotes, has more than minimal originality but is not the most creative type of work. However, citing another case, the court noted that this factor “has rarely played a significant role in the determination of a fair use dispute.”
· Amount and substantiality of the portion used: The court acknowledged that this factor leaned toward ROSS. However, it also noted that taking several thousand headnotes could still constitute taking the “heart” of the work, even if they represented a small percentage of the whole.
· Effect of the use upon the potential market: The court called this factor the “single most important element of fair use” and found that it favored Thomson Reuters. The court determined that ROSS's use was intended to create a market substitute for Westlaw. It also recognized a potential derivative market for data to train legal AI systems, and found that ROSS had not provided enough evidence to show that this market did not exist or would not be affected by its copying. Finally, the court concluded that the public interest in accessing the law does not give the public a right to Thomson Reuters's parsing of the law.
Based upon its analysis of these factors, the court ruled that the fair use defense failed.
This case is highly relevant to geospatial professionals, particularly those developing or deploying AI tools that are trained on large datasets such as satellite imagery or sensor data. Machine learning workflows in the geospatial sector that analyze terrain, classify land use, or automate feature extraction often rely on large, preexisting datasets, some of which may be proprietary or subject to copyright protection.
This case is a cautionary example: using copyrighted material, even for seemingly innovative purposes like AI training, may not be protected under the fair use doctrine, especially if the use is commercial and results in a competing product. For geospatial professionals, this underscores the importance of carefully reviewing license agreements and copyright status when sourcing data for AI applications. It also signals that data providers in the geospatial space may begin asserting intellectual property rights more aggressively, particularly as new revenue streams emerge, such as licensing data for AI training.