
OpenAI Accuses New York Times of Hiring Hacker: The Ongoing Legal Saga Continues

Last Updated February 28, 2024 3:13 PM

OpenAI alleges that The New York Times hired a hacker and violated the company’s terms of service. | Credit: Gary Hershorn, Getty Images.

Key Takeaways

The lawsuit centers on allegations that OpenAI used articles from The New York Times to train its AI chatbots.
OpenAI claims that The New York Times hired an individual to deliberately “hack” OpenAI’s products.
This ongoing conflict underscores broader concerns over AI’s access to copyrighted material.

OpenAI claims that The New York Times hired an individual to deliberately “hack” OpenAI’s products, violating its terms of use.

According to OpenAI, this individual made tens of thousands of attempts to exploit a bug in OpenAI’s systems, using deceptive prompts to generate specific, anomalous results. These efforts, OpenAI alleges, went far beyond any normal or intended use of its products.

A Violation of OpenAI’s Terms of Service

In a motion filed on Monday, the generative AI company criticized The New York Times for allegedly taking extraordinary measures to extract passages of its articles from OpenAI’s products, actions it claims clearly violate OpenAI’s terms of service.

“The truth, which will come out in the course of this case, is that the Times paid someone to hack OpenAI’s products. It took them tens of thousands of attempts to generate the highly anomalous results that make up Exhibit J to the Complaint. They were able to do so only by targeting and exploiting a bug (which OpenAI has committed to addressing) by using deceptive prompts that blatantly violate OpenAI’s terms of use,” the filing reads.

In defense of its technology and its user base, OpenAI argues that the methods employed by The New York Times do not reflect typical use of its products and should not cast a shadow over the integrity of AI technology or its impact on the news industry and copyright law.

OpenAI accuses The New York Times of trying to monopolize facts and the rules of language, emphasizing that the use of text data in training AI models for generating new content is a transformative use that does not infringe on copyright.

Neither The New York Times nor OpenAI responded to a request for comment.

A Timeline of the Lawsuit

The lawsuit between The New York Times, OpenAI, and Microsoft arises from allegations that OpenAI and Microsoft used millions of New York Times articles without permission to train their AI chatbots.

On December 27, 2023, The New York Times accused both companies of infringing on its copyrighted works to create products that could potentially replace the newspaper, thus “stealing” its audience. The lawsuit highlights a significant conflict over the use of copyrighted material for training AI, with The New York Times arguing there’s nothing transformative about using its content without payment for such purposes.

OpenAI and Microsoft defend their actions under the “fair use” doctrine, which allows the unlicensed use of copyrighted material under certain conditions.

The New York Times seeks damages estimated in the billions and demands the destruction of chatbot models and training sets incorporating its material. A machine learning professor shared his thoughts on X about the difficulty OpenAI would face in “untraining” its LLM.

Imagine "untraining" a Large Language Model on specific content.

The New York Times doesn't want its content in ChatGPT. OpenAI would have to retrain its models from scratch. Removing any content from their models would cost them millions of dollars.

I just read a new paper… pic.twitter.com/1PQ8f4zbMd

— Santiago (@svpino) January 24, 2024

OpenAI’s Copyright Doctrine Defense

In the recent filing, OpenAI declared that its large language model chatbot, ChatGPT, is not a substitute for a subscription to The New York Times. The AI company also claimed that the Times’ lawsuit is baseless:

“For good reason, there is a long history of precedent holding that it is perfectly lawful to use copyrighted content as part of a technological process that (as here) results in the creation of new, different, and innovative products. Established copyright doctrine will dictate that the Times cannot prevent AI models from acquiring knowledge about facts, any more than another news organization can prevent the Times itself from re-reporting stories it had no role in investigating.”

OpenAI’s response to the Times lawsuit highlights the tension between the news industry and generative AI firms. It touches on collaboration with publishers, the challenges surrounding content regurgitation, the disputed limits of fair use, and the debate over an opt-out option for copyrighted content.

Meanwhile, Reddit has entered into a landmark partnership with Google that will enable the search engine to use the Reddit Data API.

Following the announcement of Reddit’s deal to sell access to user data to Google for $60M for training AI, it looks like Tumblr & Wordpress are in the process of signing similar deals with OpenAI and Midjourney.

Access to training data & GPUs is going to be key in AI wars. pic.twitter.com/Az7A3tKRqS

— Dare Obasanjo🐀 (@Carnage4Life) February 28, 2024


Samantha Dunn
