
OpenAI CTO’s Evasive Response Raises Questions About Sora’s Data Sources

By Samantha Dunn

Key Takeaways

  • OpenAI CTO Mira Murati says Sora was trained on publicly available data.
  • When questioned further, Murati was unable to clarify if Sora uses data from social media.
  • Her apparent ignorance about Sora’s training data sources places OpenAI under serious scrutiny.

OpenAI CTO Mira Murati opened the company up to legal scrutiny when she said Sora was trained on publicly available and licensed data. When pressed on the exact sources, Murati was unable to confirm where this data came from.

The FTC is currently investigating OpenAI with a specific focus on its training dataset.

CTO Avoids Data Training Question

Murati’s vague response to a straightforward question from The Wall Street Journal’s Joanna Stern suggests that OpenAI’s CTO was either being evasive or unable to answer a question well within her knowledge base.

The Wall Street Journal interview with Murati provides an overview of Sora, OpenAI’s new text-to-video AI model that creates highly realistic scenes. OpenAI’s CTO shared that the technology is still in the research phase and not yet optimized for consumers.

However, Murati’s inability to detail the data sources for Sora’s training not only raises eyebrows regarding compliance with European regulations but also aligns with broader concerns about privacy and ethical AI development.

OpenAI FTC Investigation

The FTC’s focus on sensitive personal data underscores the critical nature of transparency in AI training processes to prevent the misuse of personal information and ensure that AI systems are developed responsibly.

Given the FTC’s ongoing investigation and the EU’s stringent requirements, OpenAI may need to disclose more detailed information about Sora’s training datasets or face regulatory action.

Murati’s non-specific response could be seen as a missed opportunity to demonstrate OpenAI’s commitment to ethical AI development and compliance with global standards.

If Sora is classified under the EU’s high-risk AI category, the lack of clear information about its training data could indeed lead to non-compliance with the EU AI Act’s transparency obligations. This could have serious legal and reputational consequences for OpenAI.

Growing Legal Woes

Among the growing legal battles OpenAI faces, Elon Musk recently filed a lawsuit against OpenAI and its CEO, Sam Altman, over the company’s move away from the non-profit model upon which it was founded.

The New York Times has also initiated legal action against OpenAI and its investor Microsoft on grounds of copyright infringement.

OpenAI has consistently contested these allegations, arguing that ChatGPT’s use of the plaintiffs’ works falls under fair use.

The legal battles OpenAI faces are growing, brought by competitors, media organizations, and federal investigators. Murati’s recent gaffe provides regulators with additional material for their investigations.


Samantha Dunn

Samantha started as a traditional writer and journalist before falling down the Web3 rabbit hole. She now explores the ways in which emerging technology is impacting economies, industries, and the individual.