
When Is Fair Use “Fair” for AI (and When Is It “Use”)?

The Internet Archive recently lost a high-profile case. Here’s what happened: the Open Library, a project run by the Internet Archive, uploaded digitized versions of books that it owned, and loaned them out to users online. This practice was found to violate copyright law, however, since the Internet Archive failed to procure the appropriate licenses for distributing e-books online. While the Internet Archive argued that its distribution of digital scans of copyrighted works constituted “fair use,” the judge in the case was not convinced.

While many have lamented the court’s decision, others have wondered about the potential consequences for another set of high-profile fair use cases: those concerning AI models training on copyrighted works. Numerous copyright infringement cases have been brought against AI companies, including a class-action lawsuit against Meta for training its chatbot on authors’ books without their permission, and a lawsuit from record labels against AI music-generating programs that train on copyrighted music.

Like the Internet Archive, AI companies have also claimed that their use of copyrighted materials constitutes “fair use.” These companies, however, have a potentially novel way to approach their legal challenges. While many fair use cases center around whether the use of copyrighted materials is “fair,” some newer arguments involving AI are more concerned with a different kind of “use.”

“Fair use” is a legal concept that attempts to balance the rights of copyright holders with the ability of others to use those works to create something new. Quintessential cases in which it is generally considered “fair” when someone uses copyrighted materials include criticism, satire, educational purposes, or other ways that are considered “transformative,” such as in the creation of art. These conditions have limits, though, and lawsuits are often fought in the gray areas, especially when it is argued that the use of the material will adversely affect the market for the original work.

For example, in the court’s decision against the Internet Archive, the judge argued that uploading digital copies of books failed to be “transformative” in any meaningful sense and that doing so would likely be to the detriment of the original authors – in other words, if someone can just borrow a digital copy, they are less likely to buy a copy of the book. It’s not clear how strong this economic argument is; regardless, some commentators have argued that with libraries in America facing challenges in the form of budget cuts, political censorship, and aggressive licensing agreements from publishers, there is a real need for the existence of projects like the Open Library.

While “fair use” is a legal concept, there is also a moral dimension to the ways that we might think it acceptable to use the work of others. The case of the Internet Archive arguably shows how these concepts can come apart: while the existing law in the U.S. seems to not be on the side of the Open Library, morally speaking there is certainly a case to be made that people are worse off for not having access to its services.

AI companies have been particularly interested in recent fair use lawsuits, as their programs train on large sets of data, much of which is used without permission or a licensing agreement from the creators. While companies have argued that their use of these data constitutes fair use, some plaintiffs have argued that it does not, both because it is not sufficiently transformative and because it competes with the original copyright holder.

For example, some music labels have argued that music-generating AI programs often produce content that is extremely similar, or in some cases identical, to existing music. In one case, an AI music generator reproduced artist Jason Derulo’s signature tag (i.e., that time when he says his name in his songs so you know it’s by him), a clear indication that the program was copying an existing song.

Again, we can look at the issue of fair use from both a legal and moral standpoint. Legally, it seems clear that when an AI program reproduces its source material verbatim, it is not being transformative in any meaningful way. Many have also raised moral concerns about the way that AI programs use artistic materials: both that work is being used without permission, and that it is being used in ways that its creators specifically object to.

But there is an argument from AI defenders around fair use that has less to do with what is “fair” and more to do with how copyrighted information is “used”: namely, that AI programs “use” content they find online in the same way that a person does.

Here is how such an argument might go:

-There is nothing morally or legally impermissible about a person reading a lot of content, watching a lot of videos, or listening to a lot of music online, and then using that information as knowledge or inspiration when creating new works. This is simply how people learn and create new things.

-There is nothing specifically morally or legally significant about a person profiting off of the creations that result from what they’ve learned.

-There is nothing morally or legally significant about the quantity of information one consumes or how fast one consumes it.

-An AI is capable of reading a lot of content, watching a lot of videos, and listening to a lot of music online, and using that information as knowledge or inspiration when creating new works.

-The only relevant difference between the way that AI and a person use information to create new content is the quantity of information that an AI can consume and the speed at which it consumes it.

-However, since neither quantity nor speed is a relevant moral or legal factor, AI companies are not doing anything impermissible by creating programs that use copyrighted materials online when creating new works.

Arguments of this form can be found in many places. For example, in an interview for NPR:

Richard Busch, a lawyer who represents artists who have made copyright claims against other artists, argues: “How is this different than a human brain listening to music and then creating something that is not infringing, but is influenced.”

Similarly, from the blog of AI music creator Udio:

Generative AI models, including our music model, learn from examples. Just as students listen to music and study scores, our model has “listened” to and learned from a large collection of recorded music.

While these arguments also point to the originality of the final creation, a crucial component of their defense lies in how AI programs “use” copyrighted material. The thought is that since there’s nothing inherently inappropriate about a person consuming a lot of information, processing it, getting inspired by it, and producing something as a result, we should not think it inappropriate for an AI to do the same things either.

There have, however, already been many worries raised about the inappropriate personification of AI, from concerns around AI being “conscious,” to downplaying errors by referring to them as “hallucinations.” In the above arguments, these personifications are more subtle: AI-defenders talk in terms of the programs “listening,” “creating,” “learning,” and “studying.” No one would begrudge a human being doing these things. Importantly, though, these actions are the actions of human beings – or, at least, of intelligent beings with moral status. Uncritically applying them to computer programs thus masks an important jump in logic that is not warranted by what we know about the current capabilities of AI.

There are a lot of battles to be fought over what constitutes a truly “transformative” work in lawsuits against AI companies. Regardless, part of the ongoing legal and moral discussions will undoubtedly need to shift their focus to new questions about what “use” means when it comes to AI.

The Social Justice of Copyrights and “Public Domain Day”


In addition to starting a new calendar year, January 1st marks “Public Domain Day” when copyright restrictions expire for a new batch of artworks, thereby allowing new audiences to view them more easily and new artists to adapt them without needing special permission from the copyright holder. This year, the United States saw certain works from Buster Keaton, Gertrude ‘Ma’ Rainey, Duke Ellington, Virginia Woolf, Agatha Christie, and more enter the public domain, including the classic jazz song “Sweet Georgia Brown” and F. Scott Fitzgerald’s famous book The Great Gatsby.

On the one hand, it might seem like increasing accessibility to cultural artifacts is simply obviously good; given how many high school English classrooms rely on battered copies of Fitzgerald’s story, for example, we can see immediate benefits (both aesthetic and practical) to making it easier and cheaper to purchase new books. But, taken to its logical conclusion, this kind of argument seems to suggest that it might always be necessary for artworks and artifacts to be so accessible. If Gatsby really is so valuable, and if it is so embedded within American culture that it is often called “the great American novel,” then why should Americans have had to pay to read it in the first place? Put differently: why is The Great Gatsby only just now entering the public domain?

In brief, the concept of a copyright offers two related basic protections:

  1. It ensures that artists are compensated for the work that they perform, in a way that
  2. Ensures that society will continually benefit from the work of new artists (who, following from (1), will feel free to pursue their art).

This is why, for example, the Constitution specifically grants Congress the power to “promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries.” Basically, in theory, copyrights work to level the social playing field a bit so that artists can (at least potentially) enjoy sufficient financial security to be able to practice their art. In effect, this makes copyrights a matter of social justice, since the people who benefit from these protections the most are precisely those from less-affluent or otherwise disadvantaged backgrounds. Although F. Scott Fitzgerald was not exactly socially disadvantaged, the person aiming to write the next great American novel could easily be discouraged from doing so without the hope of protected financial recompense for their labor offered by the copyright system. That is to say: aspiring writers might instead direct their energy towards non-artistic ends if their Gatsby were to simply enter the public domain immediately, without helping the writer to, say, buy groceries.

To illustrate, imagine two people who both have an interest and talent for music: Thomas is born to a wealthy family in Hollywood, while Susan grows up in a lower-middle-class family in the Ozarks. Even if copyrights don’t exist, Thomas still has the luxury to pursue his art to his heart’s content: his family’s wealth offers him a level of comfort that shields him from the risk of “wasting time” on a hobby with no guarantee of compensation. The same cannot be said of Susan so easily: while she might still have plenty of personal reasons for playing music on her own, if the realities of her social position, say, require her to work a full-time job in order to provide for basic necessities, then she would be taking on considerable risk to herself if she instead chooses to devote her time to her art without any real guarantee that her music could offer her a profitable career. In principle, copyright laws offer Susan the promise of some financial protection such that if her art ends up becoming profitable, then she will be able to uniquely enjoy the monetary fruits of her labor without other artists being allowed to copy her work (at least for a time); it’s true that Thomas gets this benefit too, but notice that it doesn’t really affect him — he already had the financial protection to do as he liked with his art in the first place.

So, philosophically speaking, copyrights serve as a mechanism to help underwrite the kind of equality that John Rawls talks about with his first principle of justice: in explaining his view of a free and fair, egalitarian society in A Theory of Justice, Rawls argues that “each person is to have an equal right to the most extensive total system of equal basic liberties compatible with a similar system of liberty for all.” Insofar as copyrights can serve to more fairly distribute opportunities to develop artistic skill and create artworks, they might be thought of as components of a just society. Without protections like this in place, it would become, in principle, roughly impossible for anyone not born into privilege to pursue a career in the arts.

It’s worth noting that this is also why artists cannot copyright “generic concepts” or natural elements of normal life: a copyright is only valid for unique artistic creations. In mid-2020, the estate of Sir Arthur Conan Doyle sued Netflix over the depiction of Sherlock Holmes in its film Enola Holmes; while many of Doyle’s stories involving the character of Holmes have entered the public domain, they all tend to present Holmes as a generally cold and unemotional person. Because it is Doyle’s later stories (that are still under copyright) that see Holmes display more warmth and kindness, the caring demeanor the detective shows his younger sister in the Netflix film provoked the copyright-holder to sue. However, the generally-ridiculed lawsuit was settled out of court in December, presumably because “warmth and kindness” are hardly unique artistic creations.

But this also evidences the problem with the other side of copyright laws: artworks are importantly different than commodities or other products for sale. Fitzgerald and Doyle weren’t just “doing their jobs,” for example, when they wrote The Great Gatsby and the Sherlock Holmes stories: they were effectively contributing to the cultural fabric of our society and the artworks that we collectively use to texture our social fabric with shared points of understanding and reference. It might be argued that, just as “warmth and kindness” are ubiquitous to the point of being un-copyrightable, the cultural familiarity of a character like “Sherlock Holmes” is (or is becoming) similarly un-copyrightable.

Such is the argument for “Public Domain Day.” Only the most radical defenders of the public domain would argue that copyrights are, in principle, problematic: indeed, artists both need and deserve to be secure to create their art (consider also: how else might audiences expect to come by new art to appreciate?). However, over time, the sedimentation of individual artifacts into the cultural consciousness makes a unique property claim on them less clearly valid — particularly after the original artist’s death. Though details differ by country, it is common now for copyrights to extend (in general) for either fifty or seventy years after the death of the artist, allowing both the original creator and their dependents to uniquely benefit from the artwork for a limited amount of time before legal ownership of the artifact is distributed collectively.

Rawls also carves out a space for thinking about copyrights in this way within his Difference Principle that allows for some individuals to benefit more than others if that inequality also serves to benefit the least advantaged in society: presumably promoting the further and continued creation of new artworks (as copyrights are designed to do) is just such a public benefit. But once the general welfare is no longer upheld by the existence of a copyright, it would be just for the copyright to dissolve — as indeed we see demonstrated and celebrated each year on Public Domain Day.

(A crucial note: you may have noticed my repeated hedging in previous paragraphs as I have defended copyright law “in principle” or “philosophically.” This is because the actual practice of copyright law in the United States is fraught with problematic and unfair issues that Rawlsian principles of justice would struggle to support. Indeed, the extension of copyright terms seen in the last few decades, the corporate interests apparently motivating such legislation, and other threats to a shrinking public domain (as well as unique questions posed by new forms of art and media) are all issues that deserve both philosophical and legislative attention in a way that is far more complicated than the simple picture I’ve sketched in this short article!)

Still, copyrights play an important part for anyone looking to protect the financial interests they have bound up in their art; for the rest of us, Public Domain Day grants us the green light to continue bearing back into the past to bring it forward into today.

3D Scans, Archaeological Sites, and “Digital Colonialism”


During the height of its power, the Islamic State in Iraq and Syria (ISIS) destroyed and looted numerous cultural heritage sites under its control. In January 2017, it was reported that ISIS had destroyed two ancient structures in Palmyra. Cultural heritage sites are also prone to natural disasters. An earthquake that hit an ancient city in Myanmar in 2016 damaged numerous temples located there.


Copyrighting Anne Frank’s Diary

On January 1, Anne Frank’s diary was published online by more than one person, despite outcry from the Anne Frank Fonds, the foundation founded by Anne’s father. The academics who published it argued that more than 70 years have passed since the death of Anne Frank in Bergen-Belsen concentration camp, which sends the work into the public domain across most of Europe. However, the foundation argues that Otto Frank, as editor and publisher, held the copyright. He died in 1980, which would mean the work is still under copyright. Additionally, the translator who worked with Otto Frank on the diary, another copyright holder, is still alive.
