GeoAI and the Law Newsletter
Keeping geospatial professionals informed on the legal and policy issues that will impact GeoAI
Summary of Recent Developments in GeoAI and the Law
Periodically, the newsletter will focus on a legal issue of particular interest to the GeoAI community. This issue will focus on intellectual property rights, specifically copyright. As discussed in the Deep Dive below, copyright in geospatial information can be complex given the numerous types of geospatial information, the different ways geospatial information is stored or visualized (e.g., a database, a map, an image), and the varying levels of copyright protection associated with geospatial information.
There are two ways in which the issue of intellectual property often arises in generative AI. The first is whether the use of materials protected by copyright for training AI models is protected by the “fair use” doctrine of copyright (discussed below). The second is whether the content generated by AI can receive copyright protection. This issue will focus on the former.
Recommended Reading
The recommended readings below are links to recent court decisions or pleadings concerning the alleged use in developing AI models of materials subject to copyright. The reader will note that the cases cover a wide range of content, including books, legal databases, music, and images. While the pleadings and court decisions are technical from a legal standpoint, they are very useful in understanding the complexity of the issue.
Thomson Reuters Enter. Ctr. GmbH v. Ross Intelligence Inc. - The plaintiff alleges that Ross Intelligence Inc. used its copyrighted materials to train a competing artificial intelligence legal search platform. The court granted in part and denied in part the defendant’s motion to dismiss.
Andersen v. Stability AI, Ltd. - The plaintiffs challenge the creation and use of "Stable Diffusion," an AI software product that was allegedly trained on their copyrighted artworks to produce images in the style of specific artists. The court granted the motion to dismiss but gave the plaintiffs leave to amend the complaint.
Authors Guild v. OpenAI, Inc. Amended complaint in a lawsuit brought by the Authors Guild and several authors against OpenAI and Microsoft, alleging that the defendants infringed the plaintiffs' copyrights by copying works wholesale to train large language models (LLMs).
Getty Images, Inc. v. Stability AI, Inc. Amended complaint in a lawsuit filed by Getty Images against Stability AI, Inc. The complaint alleges that Stability AI used more than 12 million Getty Images photos to train the model for Stability’s image generator. The claims include copyright infringement, trademark infringement, unfair competition, trademark dilution, DMCA violations, and state law claims.
Tremblay v. OpenAI, Inc. Book authors allege that their copyrighted works were used to train OpenAI's LLMs, including ChatGPT. The court granted in part and denied in part the defendants’ motion to dismiss.
Huckabee v. Bloomberg L.P. The plaintiffs allege in the complaint that defendants trained LLMs on a dataset of information scraped from a large collection of approximately 183,000 pirated eBooks.
Concord Music Grp., Inc. v. Anthropic PBC. Complaint filed by several music publishers against Anthropic PBC, alleging that its AI system reproduces significant portions of original copyrighted lyrics when prompted by users.
The N.Y. Times Co. v. Microsoft Corp. The New York Times Company brought a complaint against Microsoft Corporation and various OpenAI entities alleging copyright infringement, vicarious and contributory copyright infringement, violations of the DMCA, common law unfair competition, and trademark dilution. The complaint claims that the defendants unlawfully copied NYT content in building LLMs.
The Deep Dive
Recently, I listened to a panel of lawyers discussing the copyright issues associated with training generative AI models. Some of the key points from the panel included:
· While several of the cases listed above in Recommended Reading were originally based on plaintiffs’ claims that the outputs were copies of the original works, judges allowed the plaintiffs to amend their complaints to focus on whether the use of the material for training was a permitted “fair use” under copyright law.
· Although the four elements in determining whether the use of copyrighted materials qualifies as “fair use” (i.e., (i) purpose of the use, (ii) nature of the underlying work, (iii) substantiality of the material used, and (iv) impact of the use on the content creator’s market) were intended to be of equal value, in practice courts have tended to treat one as more important than the others. In the past, courts gave more weight to the economic impact that the use had on the intended market of the original content creator. More recently, in finding fair use, courts have focused more on whether the use was transformative in nature.
· A number of previous cases that were not related to AI could influence the courts’ decisions on these matters. These include cases involving Google Books (e.g., Authors Guild v. Google, Inc.), video games (e.g., Sega Enterprises Ltd. v. Accolade, Inc.), VCRs (e.g., Sony Corp. of America v. Universal City Studios, Inc.) and Andy Warhol (e.g., Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith).
· The courts’ decisions on whether the use of material for training models is permitted could differ based upon how the material is being used in the AI system (e.g., fine-tuning, diffusion).
· There was general agreement that simply using an LLM is unlikely to result in a copyright violation, but that publishing or copying an output later found to have been improperly generated from copyrighted material could result in one. As a result, a number of artists were not using LLMs until there was greater certainty in the law.
Of note, given the complexity of the issue, none of the panelists were willing to predict how the courts would ultimately rule on this issue.
Copyright issues associated with geospatial information are more complicated than those associated with other types of content, such as software, books, or music. For example, copyright protects original works of authorship, such as images. It also offers some protection for maps. However, it does not protect facts, which make up many types of geospatial information, although there may be limited protection in the arrangement or the selection of the facts.
In addition, geospatial products and services are often created using information collected or acquired from a variety of sources, each of which may have a different level of copyright protection. The issue is further complicated by the global nature of the geospatial ecosystem: the protection of copyrighted materials from AI systems may be different under U.S. law than under the law of other jurisdictions.
Resolution of the cases noted above, as well as others that may be brought in the future, will help set the rules on the use of copyrighted material to train AI models. Therefore, our community should follow these cases with interest. However, because of the unique aspects of geospatial information, some aspects may be unresolved for the geospatial community even after these cases are decided.