Key Takeaways
OpenAI has signed a deal with News Corp to incorporate news content from prestigious publications like The Wall Street Journal and The Times into its AI platform. The move highlights the growing hunger artificial intelligence (AI) companies have for high-quality data.
This agreement follows similar deals with the Financial Times and Axel Springer. But it may also pose a risk for the media industry, which is already struggling after years of declining ad revenue.
OpenAI has formed a long-term partnership with Condé Nast, a renowned publishing house. This collaboration will integrate content from popular Condé Nast brands like Vogue, Wired, and The New Yorker into OpenAI’s AI products, including ChatGPT and SearchGPT.
Financial details remain confidential. These partnerships with media outlets provide OpenAI with access to vast archives of text data, a crucial resource for training and improving its language models.
SearchGPT, OpenAI’s recently launched AI-powered search engine, will now incorporate real-time information from Condé Nast articles. This integration aims to deliver more comprehensive and accurate search results, challenging the dominance of traditional search engines like Google.
Brad Lightcap, OpenAI’s COO, emphasized the company’s dedication to collaborating with publishers to ensure AI-driven news discovery remains accurate and respectful of quality journalism. Roger Lynch, CEO of Condé Nast, expressed optimism about the partnership, highlighting its potential to compensate publishers for the revenue lost to technology companies in recent years.
ChatGPT’s developer OpenAI has signed a deal to incorporate news content from the Wall Street Journal, the New York Post, the Times, and the Sunday Times into its artificial intelligence platform. Both companies announced the deal in separate press releases on Wednesday. The financial terms of the deal remain undisclosed, but sources revealed it may be worth around $250 million.
This agreement will grant OpenAI access to current and archived content from News Corp’s publications. Lachlan Murdoch is News Corp’s chair, and his father, Rupert Murdoch, serves as chairman emeritus after stepping down from his roles at News Corp and Fox News last year.
This partnership follows OpenAI’s recent agreement with the Financial Times, which allows the AI giant to license FT content to develop its AI models. As part of this collaboration, ChatGPT users can access select attributed summaries, quotes, and rich links to FT journalism in response to relevant queries.
Additionally, earlier this year, the Financial Times became a customer of ChatGPT Enterprise, purchasing access for all its employees. The move aims to ensure that FT teams are well-versed in the technology and can leverage the creative and productivity benefits of OpenAI’s tools. OpenAI also signed a similar agreement in late 2023 with Axel Springer, the parent company of Business Insider and Politico.
OpenAI’s recent activities highlight how online information – from news stories and fictional works to message board posts, Wikipedia articles, computer programs, photos, podcasts, and movie clips – has become the lifeblood of the booming AI industry. Creating innovative AI systems relies heavily on vast data to teach these technologies to generate text, images, sounds, and videos that resemble human creations.
The sheer volume of data is crucial. Leading chatbot systems have learned from digital text collections totaling up to three trillion words, roughly double the number of words stored in Oxford University’s Bodleian Library, which has collected manuscripts since 1602. High-quality information, such as published books and articles meticulously written and edited by professionals, is especially valuable to AI researchers.
The urgency of the situation is increasing. According to Epoch, a research institute, tech companies could exhaust the high-quality data available on the internet by 2026. These companies are consuming data faster than it is being produced.
Some tech companies are now developing “synthetic” information in their quest for new data. Unlike organic data created by humans, this synthetic data consists of text, images, and code produced by AI models. Essentially, the systems learn from their own generated content.
OpenAI stated that each AI model “has a unique data set that we curate to help their understanding of the world and remain globally competitive in research.” Google noted that its AI models “are trained on some YouTube content,” allowed under agreements with YouTube creators, and that the company did not use data from office apps outside of an experimental program. Meta emphasized its “aggressive investments” to integrate AI into its services, utilizing billions of publicly shared images and videos from Instagram and Facebook to train its models.
After years of dealing with tech giants like Meta and Alphabet, publishers are increasingly wary. These firms dominate online ad revenue, while print turnover has declined for many publications. Their inconsistent payments and algorithm changes have also hurt media companies.
Some media executives regret not negotiating harder in the past and are taking a tougher stance on AI, despite the risk of missing out on potential licensing revenue. “It’s in my interest to find agreements with everyone,” said Le Monde CEO, Louis Dreyfus. “Without an agreement, they will use our content without any benefit for us.”
Other executives are hesitant to discuss deals until the issue of using their content to train AI models is resolved. Washington Post CEO William Lewis said the paper is also seeking significant AI partnerships.
In late 2023, OpenAI and Axel Springer signed a multi-year licensing agreement that allows OpenAI to use articles from Business Insider and Politico. The companies did not consult reporters beforehand, and the deal was briefly mentioned at a Business Insider meeting. Although significant, the deal is not unique, and AI companies often scrape the internet for data without licensing, which has led to lawsuits.
However, several outlets, including the BBC and The New York Times, have blocked OpenAI’s web crawler to prevent scraping.
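For readers curious how such a block works in practice, publishers typically add a rule to their site's robots.txt file naming the crawler they want to exclude. The sketch below is illustrative only: the robots.txt text and the example.com URL are made up for this example, not taken from any outlet mentioned here, and it assumes the crawler identifies itself with the "GPTBot" user-agent token and honors robots.txt rules. It uses Python's standard urllib.robotparser module to check what such a rule would permit.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; not copied from any real outlet.
# "GPTBot" is assumed here to be the user-agent token of OpenAI's crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain.
story_url = "https://example.com/news/story"
gptbot_allowed = can_crawl(ROBOTS_TXT, "GPTBot", story_url)
other_allowed = can_crawl(ROBOTS_TXT, "SomeOtherBot", story_url)
print("GPTBot allowed:", gptbot_allowed)
print("Other crawler allowed:", other_allowed)
```

A rule like this is a request rather than a technical barrier: it only keeps out crawlers that choose to respect it, which is one reason some publishers also pursue licensing deals or legal action.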
The impact of these agreements is debatable. Compensation for individual writers may be minimal. Nick Diakopoulos, a computational journalism professor at Northwestern University, notes that most artists receive minimal compensation, with only a few top creators earning substantial payments.
These deals benefit news outlets and journalists. However, the long-term impact of AI services could still threaten the media industry. If readers turn to AI tools for article summaries, it could reduce direct clicks to publications. This may undermine traffic-driven digital ad revenue and create a new media crisis.