AI

Perplexity and the Perplexing Legalities of Data Scraping

Of the many lawsuits media giants have filed against AI companies for copyright infringement, the one filed by Dow Jones & Co. (publisher of the Wall Street Journal) and NYP Holdings Inc. (publisher of the New York Post) against Perplexity AI adds a new wrinkle. 

Perplexity is a natural-language search engine that generates answers to user questions by scraping information from sources across the web, synthesizing the data and presenting it in an easily digestible chatbot interface. Its makers call it an “answer engine” because it’s meant to function like a mix of Wikipedia and ChatGPT. The plaintiffs, however, call it a thief that violates Internet norms to take their content without compensation. 

To me, this represents a particularly stark example of the problems with how AI platforms are operating vis-a-vis copyrighted materials, and one well worth analyzing.

According to its website, Perplexity pulls information “from the Internet the moment you ask a question, so information is always up-to-date.” Its AI seems to work by combining a large language model (LLM) with retrieval-augmented generation (RAG — oh, the acronyms!). As this is a blog about the law, not computer science, I won’t get too deep into the mechanics, but in short: Perplexity uses AI to refine a user’s question, then searches the web for up-to-date information, which it synthesizes into a seemingly clear, concise and authoritative answer. Perplexity’s business model appears to be that people will gather information through Perplexity (paying for upgraded “Pro” access) instead of doing a traditional web search that returns links the user then follows to the primary sources of the information (which is one way those media sources generate subscriptions and ad views).
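
For the technically curious, here is a minimal sketch of what a retrieval-augmented generation pipeline looks like in Python. It is based only on Perplexity’s public description; the helper functions, URLs and placeholder logic are my own illustrative stand-ins, not Perplexity’s actual code or API.

```python
# A minimal, hypothetical sketch of a retrieval-augmented generation (RAG)
# pipeline, based only on Perplexity's public description of its "answer
# engine." None of this is Perplexity's actual code or API; the helper
# functions are illustrative stand-ins.

def rewrite_query(question: str) -> str:
    # Step 1: a language model sharpens the user's question into a search query.
    # (Placeholder logic; a real system would call an LLM here.)
    return question.strip().rstrip("?")


def search_web(query: str) -> list[dict]:
    # Step 2: fetch current pages that match the query -- this is where the
    # crawling/scraping of news sites and other sources happens.
    # (Placeholder results; a real system would query a live index or crawler.)
    return [{"url": "https://example.com/article", "text": f"Example text about {query}."}]


def generate_answer(question: str, sources: list[dict]) -> str:
    # Step 3: a language model synthesizes the retrieved text into one answer,
    # typically with citations back to the sources.
    # (Placeholder synthesis: simple concatenation stands in for the model.)
    context = " ".join(s["text"] for s in sources)
    return f"Q: {question}\nA (drawing on {len(sources)} source(s)): {context}"


if __name__ == "__main__":
    question = "What did the appeals court say about fair use today?"
    sources = search_web(rewrite_query(question))
    print(generate_answer(question, sources))
```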

Part of this requires Perplexity to scrape the websites of news outlets and other sources. Web scraping is an automated method for quickly extracting large amounts of data from websites: bots find the requested information by analyzing the HTML of web pages, locate and extract the desired data, and then aggregate it into a structured format (like a spreadsheet or database) specified by the user. The data acquired this way can then be repurposed as the party doing the gathering sees fit. Is this copyright infringement? Probably: at its core, copyright infringement is copying copyrighted material without permission, and that is exactly what this kind of scraping does. 
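
To make the mechanics concrete, here is a bare-bones scraping example using the widely used Python libraries requests and BeautifulSoup. The URL and tag choices are hypothetical; real scrapers are far more elaborate (and, as discussed below, whether a given scrape is lawful is exactly what’s in dispute).

```python
# A bare-bones illustration of web scraping: fetch a page, parse its HTML,
# extract the pieces you want, and save them in a structured format.
# The URL and tag choices are hypothetical, not any outlet's real markup.
import csv

import requests
from bs4 import BeautifulSoup

url = "https://example.com/news/some-article"  # hypothetical page
response = requests.get(url, timeout=10)
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")
headline_tag = soup.find("h1")
headline = headline_tag.get_text(strip=True) if headline_tag else ""
paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]

# Aggregate the extracted data into a structured format (here, a CSV file).
with open("articles.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["url", "headline", "body"])
    writer.writerow([url, headline, " ".join(paragraphs)])
```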

To make matters worse, at least according to Dow Jones and NYP Holdings, Perplexity seems to have ignored the Robots Exclusion Protocol. This is a widely observed standard, implemented through a website’s robots.txt file, that tells automated bots which parts of a site they may not crawl or copy. However, despite the fact that these media outlets deploy the protocol, Perplexity spits out verbatim copies of some of the Plaintiffs’ articles and other materials. 
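
For context, checking a robots.txt file is trivial; Python’s standard library can do it. The sample rules and bot name below are hypothetical, not the Journal’s or the Post’s actual robots.txt.

```python
# Checking the Robots Exclusion Protocol with Python's standard library.
# The rules below are a hypothetical robots.txt, parsed locally so the
# example runs without touching any real site.
from urllib import robotparser

sample_robots_txt = """\
User-agent: *
Disallow: /subscribers/

User-agent: ExampleAIBot
Disallow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(sample_robots_txt.splitlines())

# A general-purpose crawler may fetch public pages but not the subscriber area.
print(parser.can_fetch("*", "https://example.com/news/story"))           # True
print(parser.can_fetch("*", "https://example.com/subscribers/story"))    # False

# The (hypothetical) AI bot is told to stay out entirely; ignoring that
# instruction is the behavior the plaintiffs complain about.
print(parser.can_fetch("ExampleAIBot", "https://example.com/news/story"))  # False
```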

Of course, Perplexity has a defense, of sorts. Its CEO accuses the Plaintiffs and other media companies of being incredibly shortsighted and of wishing for a world in which AI didn’t exist. Perplexity says that media companies should work with, not against, AI companies to develop shared platforms. It’s not entirely clear what financial incentives Perplexity has offered or will offer to these and other content creators. 

To me, though, it’s Perplexity that is being incredibly shortsighted. The whole premise of copyright law is that if people are economically rewarded they will create new, useful and insightful (or at least entertaining) materials. If Perplexity had its way, these creators would either not be paid at all or would have to accept whatever Perplexity deigns to offer. Presumably, this would not end well for the content creators, and eventually there would be no more reliable, up-to-date information to scrape. Moreover, Perplexity’s self-righteous claim that media companies just want to go back to the Stone Age (i.e., the 20th century) seems premised on a desire for a world in which the law allows anyone who wants copyrighted material to just take it without paying for it. And that’s not how the world works — at least for now.

OpenAI’s Texts and DMs: Business or Personal?

If you’ve been following this blog, you’re familiar with the copyright infringement cases the New York Times and the Authors Guild have brought against OpenAI, makers of ChatGPT. So familiar, in fact, I won’t summarize these suits again. You can find a prior post about these cases here. The current dispute is interesting, at least to me (social media + law = fun for a nerd like me!) because it is another data point on how courts grapple with the blurry line between business and personal communications on social media.

Taking a step back for the non-litigators and non-lawyers in the room: In litigation, the parties must exchange materials that could have a bearing on the case. This generally covers a pretty broad range of materials and requires each party to produce all such materials that are in its “possession, custody, or control.” A party can also subpoena a non-party to the case for relevant materials in the non-party’s “possession, custody, or control.” However, where possible, it’s generally better to get discovery materials from a party instead of a non-party.

Turning back to the cases against OpenAI, the Authors Guild asked the tech company to produce texts and social media direct messages from more than 30 current and former employees, including some of the company’s top executives. It claims these communications may shed light on the issues in the case.

OpenAI has pushed back strongly. It claims that its employees’ social media accounts and personal phones are, well, personal and, therefore, not in its control. It also contends the Guild’s request might intrude on these persons’ privacy. OpenAI also rejects the Guild’s assumption that OpenAI’s search of its internal materials relevant to the case will be inadequate without its employees’ and former employees’ texts and DMs. It sniffs that the Guild should wait until it receives OpenAI’s documents before presuming as much (how rude!). 

The Authors Guild has responded by pointing to OpenAI employees’ posts on X (yes, formerly Twitter) that clearly indicate they used their “personal” social media for work purposes. Same goes for their phones which, while they may not be paid for by the company, seem to have been used to text about business. 

So, who’s right here? For starters, it seems pretty likely that, at least for current OpenAI employees, OpenAI could simply tell people to turn over DMs and text messages. Assuming the employees don’t object or refuse, that should be enough to establish that OpenAI has “control.” That OpenAI apparently hasn’t taken this basic step before refusing to produce DMs and text messages seems like a really good way to piss off the Magistrate Judge hearing this issue, especially if the employees violated OpenAI policies requiring work-related communications to take place on devices and accounts owned by the company (it should have such policies if it doesn’t!) or if the communications were clearly within the scope of an employee’s employment. Unless OpenAI can make that basic showing, it seems likely that the Authors Guild will prevail. 

If it does (or if it doesn’t) there will be more about it here!

Let’s Talk About Trademarks (And AI)

I’ve posted quite a bit about the growing legal battles involving AI companies, copyright infringement, and the right of publicity. These are still early days in the evolution of AI so it’s hard to envision all the ways the technology will develop and be utilized, but I predict AI is going to come up against even more existing intellectual property laws — specifically, trademark law.

For example, in its lawsuit against OpenAI and others (which I wrote about here), the New York Times Company alleged the Defendants engaged in trademark dilution. To take a step back, trademark dilution happens when someone uses a “famous” trademark (think Nike, McDonald’s, UPS, etc.) without permission, in a way that weakens or otherwise harms the reputation of the mark’s owner. This could happen when an AI platform, in response to a user query, delivers flat-out wrong or offensive content and attributes it to a famous brand such as the New York Times. Thus, according to the Times’ complaint, when asked “what the Times said are ‘the 15 most heart-healthy foods to eat,’” Bing Chat (a Microsoft AI product) responded with, among other things, “red wine (in moderation).” However, the actual Times article on the subject “did not provide a list of heart-healthy foods and did not even mention 12 of the 15 foods identified by Bing Chat (including red wine).” Who knows where Bing got its info from, but if the misinformation and misattribution cause people to think less of the “newspaper of record,” that could be construed as trademark dilution.  

There are, however, potential pitfalls for brands that want to use trademark dilution to push back against AI platforms. Dilution is difficult to discover and expensive to pursue, and there can be a lot of ambiguity about whether a brand is “famous” and able to be significantly harmed by it. In the New York Times case, the media giant has the resources to police the Internet and to file suits, and there should be no dispute that it is a “famous” brand with a reputation that is vitally important. But smaller companies may not have the resources to search for situations where AI platforms incorrectly attribute information, or a platform visible enough to meaningfully correct the record. Plus, calculating the brand damage from AI “hallucinations” will be very difficult and costly. And, of course, this area of the law does nothing for brands that aren’t “famous.” 

Another area where trademark law and AI seem destined to face off is under the sections of the Lanham Act — the federal trademark law — that allow celebrities to sue for non-consensual use of their persona in a way that leads to consumer confusion, and allow others to sue for false advertising that influences consumer purchasing decisions. AI makes it pretty easy to manipulate a celebrity’s (or anyone’s) image or video to do and say whatever a user wants, which opens up all sorts of troublesome trademark possibilities.

Again, there are a couple of serious limitations here. For starters, the false endorsement prong likely only applies to celebrities or others who are well-known and does little to protect the rest of us. Perhaps more important (and terrifying), it seems likely that there will be significant issues in applying the Lanham Act’s provisions on false advertising in the context of deepfakes in political campaigns — like, for example, the recent robocall in advance of the New Hampshire primary that sounded like it was from President Biden. To avoid problems with the First Amendment, the Lanham Act is limited to commercial speech and thus will be largely useless for dealing with this type of AI abuse.

One other potentially interesting (and creepy) area where AI and trademark law might intersect is when it comes to humans making purchasing decisions through an AI interface. For example, a user tells a chatbot to order a case of “ShieldSafe disinfecting wipes,” but what shows up on their porch is a case of “ShieldPro disinfecting wipes” (hat tip to ChatGPT for suggesting these fictional names). While the mistake of a few letters might mean nothing to an algorithm (or even to a consumer who just wants to clean a toilet), it’s certainly going to anger a ShieldSafe Corp. that wants to prevent copycat companies from stealing its customers (and to keep its business from going down that aforementioned toilet). 

Sign of the Times: The Battle Against AI Goes Big

I closed out 2023 by writing about one lawsuit over AI and copyright and we’re starting 2024 the same way. In that last post, I focused on some of the issues I expect to come up this year in lawsuits against generative AI companies, as exemplified in a suit filed by the Authors Guild and some prominent novelists against OpenAI (the company behind ChatGPT). Now, the New York Times Company has joined the fray, filing suit late in December against Microsoft and several OpenAI affiliates. It’s a big milestone: The Times Company is the first major U.S. media organization to sue these tech behemoths for copyright infringement. 

As always, at the heart of the matter is how AI works: Companies like OpenAI ingest existing text databases, which are often copyrighted, and write algorithms (called large language models, or LLMs) that detect patterns in the material so that they can then imitate it to create new content in response to user prompts.
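
To make that concrete, here is a toy illustration in Python of how a model that merely “detects patterns” in its training text can end up reproducing that text nearly verbatim. It uses a crude word-pair (Markov-style) model, not an actual LLM, which is vastly more complex; the training snippet is invented placeholder text, not anything from the Times.

```python
# A toy word-level model that "learns" its training text by recording which
# word follows which -- a crude stand-in for pattern detection. Real LLMs are
# enormously more sophisticated, but the memorization risk is analogous:
# with a small or dominant source, generation can track the source verbatim.
import random
from collections import defaultdict

training_text = (
    "the court held that the unlicensed copying was not a fair use "
    "because the copying harmed the market for the original work"
)  # invented placeholder text, not an actual article

# "Training": count which word follows which.
transitions = defaultdict(list)
words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    transitions[current_word].append(next_word)

# "Generation": starting from a prompt word, repeatedly emit a plausible next word.
random.seed(0)
word = "the"
output = [word]
for _ in range(15):
    followers = transitions.get(word)
    if not followers:
        break
    word = random.choice(followers)
    output.append(word)

print(" ".join(output))  # With a tiny corpus, the output closely mirrors the source.
```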

The Times Company’s complaint, which was filed in the Southern District of New York on December 27, 2023, alleges that by using New York Times content to train their algorithms, the defendants directly infringed the Times’s copyrights. It further alleges that the defendants engaged in contributory copyright infringement and that Microsoft engaged in vicarious copyright infringement. (In short, contributory copyright infringement is when a defendant was aware of infringing activity and induced or contributed to that activity; vicarious copyright infringement is when a defendant could have prevented — but didn’t — a direct infringer from acting, and financially benefits from the infringing activity.) Finally, the complaint alleges that the defendants violated the Digital Millennium Copyright Act by removing copyright management information included in the New York Times materials, and accuses the defendants of engaging in unfair competition and trademark dilution. 

The defendants, as always, are expected to claim they’re protected under “fair use” because their unlicensed use of copyrighted content to train their algorithms is transformative. 

What all this means is that while 2023 was the year that generative AI exploded into the public’s consciousness, 2024 (and beyond) will be when we find out what federal courts think of the underlying processes fueling this latest data revolution.

I’ve read the New York Times’ complaint (so you don’t have to) and here are some takeaways:

  • The Times Company tried (unsuccessfully) to negotiate with OpenAI and Microsoft (a major investor in OpenAI) but was unable to reach an agreement that would “ensure [The Times] received fair value for the use of its content.” This likely hurts the defendants’ claims of fair use. 
  • As in the other lawsuits against OpenAI and similar companies, there’s an input problem and an output problem. The input problem comes from the AI companies ingesting huge amounts of copyrighted data from the web. The output problem comes from the algorithms trained on the data spitting out material that is identical (or nearly identical) to what they ingested. In these situations, I think it’s going to be rough going for the AI companies’ fair use claim. However, they have a better fair use argument where the AI models create content “in the style of” something else.
  • The Times Company’s case against Microsoft comes, in part, from the fact that Microsoft is alleged to have “created and operated bespoke computing systems to execute the mass copyright infringement . . .” described in the complaint.
  • OpenAI allegedly favored “high-quality content, including content from the Times” in training its LLMs.
  • When prompted, ChatGPT can regurgitate large portions of the Times’ journalism nearly verbatim. Here’s an example taken from the complaint showing the output of ChatGPT on the left in response to “minimal prompting,” and the original piece from the New York Times on the right. (The differences are in black.)

Excerpt from The New York Times Company's Complaint

  • According to the New York Times, this content, easily accessible for free through OpenAI, would normally be available only behind the Times’ paywall. The complaint also contains similar examples from Bing Chat (a Microsoft product) that go far beyond what you would get in a normal search using Bing. (In response, OpenAI says that this kind of wholesale reproduction is rare and is prohibited by its terms of service. I presume that OpenAI has since fixed the issue, but that doesn’t absolve it of liability.)
  • Because OpenAI keeps the design and training of its GPT models closely guarded, expect the confidentiality order in this case to be intense.
  • While the New York Times Company can afford to fight this battle, many smaller news organizations lack the resources to do the same. In the complaint, the Times Company warns of the potential harm to society of AI-generated “news,” including its devastating effect on local journalism, which, if the past is any indication, will be bad for all of us.

Stay tuned. OpenAI and Microsoft should file their responses, which I expect will be motions to dismiss, in late February or so. When I get those, I’ll see you back here.

Big Name Authors Battle the Bots

This year has brought us some of the early rounds of the fights between creators and AI companies, notably Microsoft, Meta, and OpenAI (the company behind ChatGPT). In addition to the Hollywood strikes, we’ve also seen several lawsuits between copyright owners and companies developing AI products. The claims largely focus on the AI companies’ creation of “large language models” or “LLMs.” (By way of background, LLMs are algorithms that take a large amount of information and use it to detect patterns so that they can create their own “original” content in response to user prompts.) 

Among these cases is one filed by the Authors Guild and several prominent writers (including Jonathan Franzen and Jodi Picoult) in the Southern District of New York. It alleges OpenAI ingested large databases of copyrighted materials, including the plaintiffs’ works, to train its algorithms. In early December, the plaintiffs amended their complaint to add Microsoft as a defendant, alleging that Microsoft knew about and assisted OpenAI in its infringement of the plaintiffs’ copyrights.

Because it is the end of the year, here are five “things to look for in 2024” in this case (and others like it): 

  1. What will defendants argue on fair use and how will the Supreme Court’s 2023 decision in Goldsmith impact this argument? (In 2023 the SCOTUS ruled that Andy Warhol’s manipulation of a photograph by Lynn Goldsmith was not transformative enough to qualify as fair use.)
  2. Does the fact that the output of platforms like ChatGPT isn’t copyrightable have any impact on the fair use analysis? The whole idea behind fair use is to encourage subsequent creators to build on the work of earlier creators, but what happens to this analysis when the later “creator” is merely a computer doing what it was programmed to do? 
  3. Will the fact that OpenAI recently inked a deal with Axel Springer (publisher of Politico and Business Insider) to allow OpenAI to summarize its news articles as well as use its content as training data for OpenAI’s large language models affect OpenAI’s fair use argument?
  4. What impact, if any, will this and other similar cases have on the business model for AI? Big companies and venture capital firms have invested heavily in AI, but if courts rule they must pay authors and other creators for their copyrighted works it dramatically changes the profitability of this model. Naturally, tech companies are putting forth numerous arguments against payment, including how little each individual creator would get considering how large the total pool of creators is, how it would curb innovation, etc. (One I find compelling is the idea that training a machine on copyrighted text is no different from a human reading a bunch of books and then using the knowledge and sense of style gained to go out and write one of their own.)
  5. Is Microsoft, which sells (copyrighted) software, ok with a competitor training its platform on copyrighted materials? I’m guessing that’s probably not ok.

These are all big questions with a lot at stake. For good and for ill, we live in exciting times, and in the arena of copyright and IP law I guarantee that 2024 will be an exciting year. See you then!