OpenAI Signs 5 Year Deal with News Corp to Access Its Data – How This Impacts Journalism

Published May 23, 2024 1:06 PM
Giuseppe Ciccomascolo

Key Takeaways

  • OpenAI has struck a deal with News Corp to access its large media database.
  • This is the latest partnership Sam Altman’s company has announced with a media giant.
  • OpenAI’s spate of recent deals with big media corporations highlights the need for content in training AI models.

OpenAI has signed a deal with News Corp to incorporate news content from prestigious publications like The Wall Street Journal and The Times into its AI platform. The move highlights the growing hunger artificial intelligence (AI) companies have for high-quality data.

This agreement follows similar deals with the Financial Times and Axel Springer. But it may also pose a risk to the media industry, which is already on its knees after years of falling ad revenue.

OpenAI’s Deals With News Giants

ChatGPT’s developer OpenAI has signed a deal to incorporate news content from the Wall Street Journal, the New York Post, the Times, and the Sunday Times into its artificial intelligence platform. Both companies announced the deal in separate press releases on Wednesday. The financial terms remain undisclosed, but sources said the deal may be worth around $250 million.

This agreement will grant OpenAI access to current and archived content from News Corp’s publications. Lachlan Murdoch is News Corp’s chair, and his father, Rupert Murdoch, serves as chairman emeritus after stepping down from his roles at News Corp and Fox News last year.

This partnership follows OpenAI’s recent agreement with the Financial Times, which allows the AI giant to license FT’s content to develop its AI models. ChatGPT users can access select attributed summaries, quotes, and rich links to FT journalism in response to relevant queries as part of this collaboration.

Additionally, earlier this year, the Financial Times became a customer of ChatGPT Enterprise, purchasing access for all its employees. This move aims to ensure that FT teams are well-versed in the technology and can leverage the creative and productivity benefits of OpenAI’s tools. OpenAI also signed a similar agreement earlier this year with Axel Springer, the parent company of Business Insider and Politico.

AI Companies’ Hunger For Data

OpenAI’s recent activities highlight how online information – from news stories and fictional works to message board posts, Wikipedia articles, computer programs, photos, podcasts, and movie clips – has become the lifeblood of the booming AI industry. Creating innovative AI systems relies heavily on vast data to teach these technologies to generate text, images, sounds, and videos that resemble human creations.

The sheer volume of data is crucial. Leading chatbot systems have learned from digital text collections totaling up to three trillion words, roughly double the number of words stored in Oxford University’s Bodleian Library, which has collected manuscripts since 1602. High-quality information, such as published books and articles meticulously written and edited by professionals, is especially valuable to AI researchers.

The urgency of the situation is increasing. According to Epoch, a research institute, tech companies could exhaust the high-quality data available on the internet by 2026. These companies are consuming data faster than it is being produced.

Some tech companies are now developing “synthetic” information in their quest for new data. Unlike organic data created by humans, this synthetic data consists of text, images, and code produced by AI models – essentially, the systems learn from their own generated content.

OpenAI stated that each AI model “has a unique data set that we curate to help their understanding of the world and remain globally competitive in research.” Google noted that its AI models “are trained on some YouTube content,” as allowed under agreements with YouTube creators, and that the company did not use data from office apps outside of an experimental program. Meta emphasized its “aggressive investments” to integrate AI into its services, using billions of publicly shared images and videos from Instagram and Facebook to train its models.

The Impact On Journalism

After years of dealing with tech giants like Meta and Alphabet, publishers are increasingly wary. These firms dominate online ad revenue, while print turnover has declined for many publications. Their inconsistent payments and algorithm changes have also hurt media companies.

Some media executives regret not negotiating harder in the past and are taking a tougher stance on AI, despite the risk of missing out on potential licensing revenue. “It’s in my interest to find agreements with everyone,” said Le Monde CEO Louis Dreyfus. “Without an agreement, they will use our content without any benefit for us.”

Other executives are hesitant to discuss deals until the question of whether their content can be used to train AI models is resolved. Washington Post CEO William Lewis said the paper is also seeking significant AI partnerships.

In late 2023, OpenAI and Axel Springer signed a multi-year licensing agreement allowing OpenAI to use articles from Business Insider and Politico. The companies did not consult reporters beforehand, and the deal was only briefly mentioned at a Business Insider meeting. Although significant, the deal is not unique in drawing on news content, as AI companies often scrape the internet for data without licensing it, which has led to lawsuits.

However, several outlets, including the BBC and The New York Times, have blocked OpenAI’s web crawler to prevent scraping.
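
The article does not say how these outlets implemented the block. A common way for a site to turn away a crawler is a rule in its robots.txt file, and OpenAI's crawler identifies itself with the user-agent token GPTBot. The sketch below is an illustration only: the robots.txt content, the example.com URL, and the `is_allowed` helper are all hypothetical, and it simply shows how such a rule would be read using Python's standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (not taken from any real publisher):
# it disallows the GPTBot user agent everywhere and allows all other crawlers.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative URL only.
story_url = "https://example.com/news/story"

gptbot_allowed = is_allowed(ROBOTS_TXT, "GPTBot", story_url)
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", story_url)

print("GPTBot allowed:", gptbot_allowed)
print("Other crawler allowed:", other_allowed)
```

Note that robots.txt is a voluntary convention: it only works if the crawler chooses to honor it, which is one reason publishers also pursue licensing deals and lawsuits.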

The impact of these agreements is debatable. Compensation for individual writers may be minimal. Most artists receive very little from compensation programs, and only a small proportion of the most popular creators receive substantial payments, said Nick Diakopoulos, a computational journalism professor at Northwestern University.

Even if these deals benefit news outlets and journalists, the long-term impact of AI services could still threaten the media industry. If readers turn to AI tools for article summaries, it could reduce direct clicks to publications, undermining traffic-driven digital ad revenue and creating a new media crisis.
