AI

January 16, 2024

Sign of the Times: The Battle Against AI Goes Big

I closed out 2023 by writing about one lawsuit over AI and copyright and we’re starting 2024 the same way. In that last post, I focused on some of the issues I expect to come up this year in lawsuits against generative AI companies, as exemplified in a suit filed by the Authors Guild and some prominent novelists against OpenAI (the company behind ChatGPT). Now, the New York Times Company has joined the fray, filing suit late in December against Microsoft and several OpenAI affiliates. It’s a big milestone: The Times Company is the first major U.S. media organization to sue these tech behemoths for copyright infringement.

As always, at the heart of the matter is how AI works: Companies like OpenAI ingest existing text databases, which are often copyrighted, and write algorithms (called large language models, or LLMs) that detect patterns in the material so that they can then imitate it to create new content in response to user prompts.

The Times Company’s complaint, which was filed in the Southern District of New York on December 27, 2023, alleges that by using New York Times content to train its algorithms, the defendants directly infringed on the New York Times’ copyright. It further alleges that the defendants engaged in contributory copyright infringement and that Microsoft engaged in vicarious copyright infringement. (In short, contributory copyright infringement is when a defendant was aware of infringing activity and induced or contributed to that activity; vicarious copyright infringement is when a defendant could have prevented — but didn’t — a direct infringer from acting, and financially benefits from the infringing activity.) Finally, the complaint alleges that the defendants violated the Digital Millennium Copyright Act by removing copyright management information included in the New York Times’ materials, and accuses the defendants of engaging in unfair competition and trademark dilution.

The defendants, as always, are expected to claim they’re protected under “fair use” because their unlicensed use of copyrighted content to train their algorithms is transformative.

What all this means is that while 2023 was the year that generative AI exploded into the public’s consciousness, 2024 (and beyond) will be when we find out what federal courts think of the underlying processes fueling this latest data revolution.

I’ve read the New York Times’ complaint (so you don’t have to) and here are some takeaways:

The Times Company tried to negotiate with OpenAI and Microsoft (a major investor in OpenAI) but was unable to reach an agreement that would “ensure [The Times] received fair value for the use of its content.” This likely hurts the defendants’ claims of fair use.
As in the other lawsuits against OpenAI and similar companies, there’s an input problem and an output problem. The input problem comes from the AI companies ingesting huge amounts of copyrighted data from the web. The output problem comes from the algorithms trained on the data spitting out material that is identical (or nearly identical) to what they ingested. In these situations, I think it’s going to be rough going for the AI companies’ fair use claim. However, they have a better fair use argument where the AI models create content “in the style of” something else.
The Times Company’s case against Microsoft comes, in part, from the fact that Microsoft is alleged to have “created and operated bespoke computing systems to execute the mass copyright infringement . . .” described in the complaint.
OpenAI allegedly favored “high-quality content, including content from the Times” in training its LLMs.
When prompted, ChatGPT can regurgitate large portions of the Times’ journalism nearly verbatim. Here’s an example taken from the complaint showing the output of ChatGPT on the left in response to “minimal prompting,” and the original piece from the New York Times on the right. (The differences are in black.)

Excerpt from The New York Times Company's Complaint

According to the New York Times, this content, easily accessible for free through OpenAI, would normally be available only behind its paywall. The complaint also contains similar examples from Bing Chat (a Microsoft product) that go far beyond what you would get in a normal search using Bing. (In response, OpenAI says that this kind of wholesale reproduction is rare and is prohibited by its terms of service. I presume that OpenAI has since fixed this issue, but that doesn’t absolve OpenAI of liability.)
Because OpenAI keeps the design and training of its GPT algorithms secret, the confidentiality order in this case will likely be intense.
While the New York Times Company can afford to fight this battle, many smaller news organizations lack the resources to do the same. In the complaint, the Times Company warns of the potential harm to society of AI-generated “news,” including its devastating effect on local journalism which, if the past is any indication, will be bad for all of us.

Stay tuned. OpenAI and Microsoft should file their response, which I expect will be a motion to dismiss, in late February or so. When I get it, I’ll see you back here.

December 19, 2023

Big Name Authors Battle the Bots

This year has brought us some of the early rounds of the fights between creators and AI companies, notably Microsoft, Meta, and OpenAI (the company behind ChatGPT). In addition to the Hollywood strikes, we’ve also seen several lawsuits between copyright owners and companies developing AI products. The claims largely focus on the AI companies’ creation of “large language models” or “LLMs.” (By way of background, LLMs are algorithms that take a large amount of information and use it to detect patterns so that they can create their own “original” content in response to user prompts.)

Among these cases is one filed by the Authors Guild and several prominent writers (including Jonathan Franzen and Jodi Picoult) in the Southern District of New York. It alleges OpenAI ingested large databases of copyrighted materials, including the plaintiffs’ works, to train its algorithms. In early December, the plaintiffs amended their complaint to add Microsoft as a defendant, alleging that Microsoft knew about and assisted OpenAI in its infringement of the plaintiffs’ copyrights.

Because it is the end of the year, here are five “things to look for in 2024” in this case (and others like it):

What will defendants argue on fair use, and how will the Supreme Court’s 2023 decision in Goldsmith impact this argument? (In 2023, the Supreme Court ruled that Andy Warhol’s manipulation of a photograph by Lynn Goldsmith was not transformative enough to qualify as fair use.)
Does the fact that the output of platforms like ChatGPT isn’t copyrightable have any impact on the fair use analysis? The whole idea behind fair use is to encourage subsequent creators to build on the work of earlier creators, but what happens to this analysis when the later “creator” is merely a computer doing what it was programmed to do?
Will the fact that OpenAI recently inked a deal with Axel Springer (publisher of Politico and Business Insider) to allow OpenAI to summarize its news articles as well as use its content as training data for OpenAI’s large language models affect OpenAI’s fair use argument?
What impact, if any, will this and other similar cases have on the business model for AI? Big companies and venture capital firms have invested heavily in AI, but if courts rule they must pay authors and other creators for their copyrighted works it dramatically changes the profitability of this model. Naturally, tech companies are putting forth numerous arguments against payment, including how little each individual creator would get considering how large the total pool of creators is, how it would curb innovation, etc. (One I find compelling is the idea that training a machine on copyrighted text is no different from a human reading a bunch of books and then using the knowledge and sense of style gained to go out and write one of their own.)
Is Microsoft, which sells (copyrighted) software, ok with a competitor training its platform on copyrighted materials? I’m guessing that’s probably not ok.

These are all big questions with a lot at stake. For good and for ill, we live in exciting times, and in the arena of copyright and IP law I guarantee that 2024 will be an exciting year. See you then!

November 21, 2023

Can NO FAKES be for Real?

This week, I’m taking a break from talking about court cases and instead focusing on a draft bill aimed at creating a federal right of publicity that was introduced in October by a bipartisan group of Senators. A quick refresher: the right of publicity allows an individual to control the use of their name, likeness, voice, etc., and laws or cases governing this right exist in about two-thirds of the states.

Now, with generative AI and “deepfake” technology, celebrities and entertainment companies are pushing for greater protection against the creation of unauthorized digital replicas of a person’s image, voice, or visual likeness. And the Senate, it appears, is responding, raising concerns among digital rights groups and others about First Amendment rights and limits on creative freedom.

Before diving into the specifics of the bill and its potential implications, I want to step back and talk about the underlying reasons for intellectual property laws. These laws are the subject of entire law school classes (I took several of them), but I can quickly summarize two fundamental reasons why they exist. The first is to encourage artistic works and inventions, an idea that can be found in the U.S. Constitution. The idea is that allowing creators (in the case of copyright law) and inventors (in the case of patent law) to exclusively reap the economic benefits of their work will incentivize people to make art and invent useful things. Notably, both copyrights and patents are in effect for a limited amount of time: for patents, 20 years from the date of the application, while copyrights run for the life of the creator plus 70 years (note that length; it’s going to come up again).

The second reason is to prevent consumer confusion. This is the central concern of trademark and unfair competition laws, which are intended to ensure that no one other than the company associated with a particular good or service is selling that good or service.

The idea behind the right of publicity (you can read more about it in the context of generative AI here) includes a dash of both of these rationales. It ensures that individuals can profit from their investment in their persona by preventing others from using their name, likeness, voice, etc., without their permission. It also prevents brands from claiming someone endorsed a product without that person’s consent.

With generative AI and the ease with which anyone can now create a digital replica of a celebrity to endorse a product or perform a song, artists and entertainment companies are worried that the current patchwork of state laws isn’t enough. Hence, the Nurture Originals, Foster Art, and Keep Entertainment Safe Act of 2023 or the NO FAKES Act of 2023, which, if enacted, would create a federal right of publicity. (A side question: in hiring staff, do Members of Congress test job applicants’ ability to come up with wacky bill titles that can be made into acronyms? Because this one certainly took some legitimate skill.)

The bill protects against the creation of an unauthorized “digital replica,” which the NO FAKES Act describes as: “a newly created, computer-generated, electronic representation of the image, voice, or visual likeness of an individual that is [nearly indistinguishable] from the actual image, voice, or visual likeness of an individual; and is fixed in a sound recording or an audiovisual work in which that individual did not actually perform or appear.”

In other words, NO FAKES bars using a computer to create an audiovisual work or a recording that looks or sounds very much like a real person when that person has not consented. This proposed right bars the creation of a digital replica during a person’s lifetime and for 70 years after death (the same post-mortem term as under existing copyright law). In the case of a dead person, the person or entity that owns the deceased’s publicity rights (often, the deceased’s heirs) would have to consent to the creation of a digital replica.

If NO FAKES is passed, anyone who creates an unauthorized digital replica can be sued by the person who controls the rights; the rights holder can also sue anyone, like a website or streaming platform, who knowingly publishes, distributes, or transmits a digital replica without consent. This is true even if the work includes a disclaimer stating the work is unauthorized.

That said, the Act as currently drafted does include some exceptions intended to protect the First Amendment. For example, NO FAKES states that it is not a violation of the Act to create a digital replica that is used as part of a news broadcast or documentary or for purposes of “comment, criticism, scholarship, satire, or parody.”

Some other things to note:

The right to control the creation of a digital replica does not extend to images that are unaccompanied by audio.
The draft bill states that the right to control digital replicas “shall be considered to be a law pertaining to intellectual property for the purposes of section 230(e)(2) of the Communications Act of 1934.” This means that Internet service providers cannot rely on Section 230 to avoid liability.

Now, it is likely the draft will have undergone significant amendments and revisions if and when it is passed. As mentioned above, digital rights groups and others worry that the right of publicity can be used to litigate against speech protected by the First Amendment, as public figures in the past have tried when they don’t like something that has been said about them in the media.

To me, the Act seems a bit suspicious. You may notice I’ve stressed how the Act extends protection against digital replicas to 70 years post-mortem, the same exact length as copyright protection. Isn’t this expansiveness a bit much considering the current state of play is no federal right of publicity at all? The extreme length of the proposed protection, coupled with the Act eliminating the use of disclaimers as a shield against liability, suggests NO FAKES is less about protecting the public and more about prolonging celebrities’ and entertainment companies’ ability to profit. After all, the right of publicity created in the NO FAKES Act can be sold by an actor or their heirs to a company like, say, a movie studio… that could then, in theory, continue to feature digital replicas of the aged or deceased actor in its films unchallenged for seven decades after death. Thelma and Louise 4: Back From the Abyss is coming, and Brad Pitt won’t look a day over 30.

Good, perhaps, for Brad Pitt. The rest of us, maybe not.

November 7, 2023

Does Machine Learning Violate Human Copyright?

On October 30, 2023, a judge in the Northern District of California ruled in one of the first lawsuits between artists and generative AI art platforms for copyright infringement. While the judge quickly dismissed some of the Plaintiffs’ claims, the case is still very much alive as he is allowing them to address some of the problems in their case and file amended complaints.

So what’s it all about? Three artists are suing Stability AI Ltd. and Stability AI, Inc. (collectively, “Stability”), whose platform, Stable Diffusion, generates photorealistic images from text input. To teach Stable Diffusion how to generate images, Stability’s programmers scrape (i.e., take or steal, depending on how charitable you’re feeling) the Internet for billions of existing copyrighted images, among them, allegedly, images created by the Plaintiffs. End users (i.e., people like you and me) can then use Stability’s platform to create images in the style of the artists on whose work the AI has been trained.

In addition to Stability, the proposed class action suit on behalf of other artists also names as defendants Midjourney, another art generation AI that incorporates Stable Diffusion, and DeviantArt, Inc., an online community for digital artists, which Stability scraped to train Stable Diffusion, and which also offers a platform called DreamUp that is built on Stable Diffusion.

The Plaintiffs — Sarah Andersen, Kelly McKernan, and Karla Ortiz — allege, among other things, that Defendants infringed on their copyrights, violated the Digital Millennium Copyright Act, and engaged in unfair competition.

In ruling on Defendants’ motion to dismiss, U.S. District Judge William Orrick quickly dismissed the copyright claims brought by McKernan and Ortiz against Stability because they hadn’t registered copyrights in their artworks — oops.

Andersen, however, had registered copyrights. Nonetheless, Stability argued her claim of copyright infringement should be dismissed because she couldn’t point to specific works that Stability used as training images. The Court rejected that argument. It concluded that the fact she could show that some of her registered works were used for training Stable Diffusion was enough at this stage to allege a violation of the Copyright Act.

The judge, however, dismissed Andersen’s direct infringement claim against DeviantArt and Midjourney. With DeviantArt, he found that Plaintiffs hadn’t alleged that DeviantArt had any affirmative role in copying Andersen’s images. For Midjourney, the judge found that Plaintiffs needed to clarify whether the direct infringement claim was based on Midjourney’s use of Stable Diffusion and/or whether Midjourney independently scraped images from the web and used them to train its product. Judge Orrick is allowing them to amend their complaint to do so.

Because Orrick dismissed the direct infringement claims against DeviantArt and Midjourney, he also dismissed the claims for vicarious infringement against them. (By way of background, vicarious infringement is where a defendant has the “right and ability” to supervise infringing conduct and has a financial interest in that conduct.) Again, however, the Court allowed Plaintiffs to amend their complaint to state claims for direct infringement against DeviantArt and Midjourney, and also to amend their complaint to allege vicarious infringement against Stability for the use of Stable Diffusion by third parties.

Orrick warned the Plaintiffs (and their lawyers) that he would “not be as generous with leave to amend on the next, expected rounds of motions to dismiss and I will expect a greater level of specificity as to each claim alleged and the conduct of each defendant to support each claim.”

Plaintiffs also alleged that Defendants violated their right of publicity, claiming that Defendants used their names to promote their AI products. However, the Court dismissed these claims because the complaint didn’t actually allege that the Defendants advertised their products using Plaintiffs’ names. Again, the judge granted the Plaintiffs leave to amend. (The Plaintiffs originally tried to base a right of publicity claim on the fact that Defendants’ platforms allowed users to produce AI-generated works “in the style of” their artistic identities. An interesting idea, but Plaintiffs abandoned it.)

In addition, DeviantArt moved to dismiss Plaintiffs’ right of publicity claim on grounds that DeviantArt’s AI platform generated expressive content. Therefore, according to DeviantArt, the Court needed to balance the Plaintiffs’ rights of publicity against DeviantArt’s interest in free expression by considering whether the output was transformative. (Under California law, “transformative use” is a defense to a right of publicity claim.) The Court found that this was an issue that couldn’t be decided on a motion to dismiss and would have to wait.

What are the key takeaways here? For starters, it is fair to say that the judge thought that Plaintiffs’ complaint was not a paragon of clarity. It also seems like the judge thought that Plaintiffs would have a hard time alleging that images created by AI platforms in response to user text input were infringing. However, he seemed to indicate that he was more likely to allow copyright infringement claims based on Stability’s use of images to train Stable Diffusion to proceed.

June 27, 2023

AI Face Replacement: A Class Act(ion)?

Intellectual property class action lawsuits have, historically, been relatively rare. But here, at the dawn of AI, everything is changing fast, and we already have what appears to be the first attempt at an AI-related class action: Young v. NeoCortext, Inc.

This action is currently pending in the Central District of California against the owners of Reface, a “deep fake” generative AI app that enables users to replace a celebrity’s face in a still photo from a film or TV with their own face. The app includes a searchable catalog that allows a user to select the star whose face they want to replace. This library includes images of Kyland Young — a finalist in season 23 of CBS’ Big Brother — who is seeking to represent a class of California residents including musicians, athletes, celebrities “and other well-known individuals” who have had their “name, voice, signature, photograph, or likeness” displayed in Reface.

Young alleges that Reface’s inclusion of his image violates his rights under California’s right of publicity statute. This law protects individuals against the unauthorized use of their image, name, or voice to advertise or sell a product. His claim hinges on a specific detail: Reface promotes paid subscriptions with a free version that allows users to generate an image with their face in the place of a celebrity. Images generated by the free version are watermarked with Reface’s logo and say “made with Reface app.” According to Young, this amounts to an ad for the paid version of the Reface app. Thus, he claims that Reface’s owner is exploiting his image (and the image of other celebrities and demi-celebrities) to encourage users to purchase the paid version of the app, which brings the app within the ambit of California’s right of publicity statute.

Lawyers for NeoCortext, which owns the app, have moved to dismiss the complaint. They argue, among other things, that Plaintiff’s claims are preempted by the Copyright Act and are barred by the First Amendment.

On preemption, Defendant argues that since images of Young used on the app are owned by CBS, not Young, any action for the unauthorized use of these images would have to be brought by CBS, not Plaintiff. It argues that CBS’ claims (if any) would sound in copyright infringement, not a violation of the right of publicity. It seems likely that the Defendant will prevail on this argument.

Even if the Defendant doesn’t prevail on this argument and the case survives the motion to dismiss, this copyright issue could create problems certifying a class. One issue courts consider in determining whether a suit can be heard as a class action is “commonality.” This requires judges to consider if the potential class members (in this case, other celebrities) are likely to have more issues in common than not. The possibility that some claims might be preempted by copyright law while others are not might lead the judge to conclude that common issues don’t predominate. This could preclude the certification of the action as a class action.

Defendant also argues that Plaintiff’s claim should be dismissed because it “violates the expressive rights of Defendant and its users that are guaranteed by the First Amendment.” Here, Defendant claims that modifying celebrity images to convey an idea or message can be an exercise of creative self-expression within the scope of the First Amendment, and thus Reface performs a “transformative use,” which brings it outside of the ambit of California’s right of publicity statute.

All in all, at least on copyright preemption, Defendant’s arguments seem more convincing.

With that said, this lawsuit points to how AI is making it easier to manipulate celebrities’ images. This will undoubtedly lead to more right of publicity lawsuits.