Key Takeaways
When European consumers were first introduced to ChatGPT in 2022, authorities quickly moved to investigate whether the new chatbot complied with EU privacy regulations.
As those investigations move forward, OpenAI could be facing a number of challenges related to its handling of personal data.
Authorities across the EU have opened investigations into whether ChatGPT breaches the bloc's General Data Protection Regulation (GDPR).
The Italian privacy watchdog moved first, blocking ChatGPT in March 2023. Although the restrictions were later lifted, in January this year the regulator concluded that OpenAI had failed to provide sufficient privacy guarantees.
With authorities in Austria and Spain also receiving complaints about the chatbot, the European Data Protection Board (EDPB) launched a special task force to investigate the matter.
In a preliminary report published by the task force last month, the EDPB highlighted some of the privacy issues surrounding the technology.
Many of the concerns highlighted by the EDPB arise from the AI training process, which relies on vast amounts of data scraped from the internet.
In its latest report, the EDPB has proposed that OpenAI should ensure that certain categories of data aren’t collected. For example, it suggested measures could be implemented to prevent data from being collected from public social media profiles, which are likely to contain personal data as defined by the GDPR.
“Furthermore, measures should be in place to delete or anonymize personal data that has been collected via web scraping before the training stage,” it added.
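The EDPB does not prescribe a particular technique for this. As a rough illustration only, a pre-training anonymization pass might redact obvious identifiers before text enters a training corpus; the patterns and function below are hypothetical, not OpenAI's actual pipeline, and real systems would rely on far more robust detection (e.g. named-entity recognition) than simple regexes:

```python
import re

# Illustrative patterns only; a production pipeline would use NER models
# and much broader coverage than these two simple regexes.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def anonymize(text: str) -> str:
    """Replace detected personal data with placeholder tokens
    before the text is added to a training corpus."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

scraped = "Contact Jane at jane.doe@example.com or +44 20 7946 0958."
print(anonymize(scraped))
# → Contact Jane at [EMAIL] or [PHONE].
```

Whether redaction of this kind amounts to anonymization in the GDPR's strict sense is itself a legal question; the sketch only shows where such a step would sit in a scraping-to-training workflow.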
Transparency is a key requirement under GDPR, especially when it comes to informing data subjects about how their data is used.
The EDPB warned that ChatGPT risked violating Article 14 of the GDPR, which requires notifying people when their personal data is collected and processed.
However, it acknowledged that OpenAI may be able to invoke a clause in Article 14 that absolves firms of responsibility if “the provision of such information proves impossible.”
The EDPB’s intervention comes as OpenAI is under mounting pressure to be more transparent about where it gets its training data from. For instance, when asked about the videos used to train Sora, OpenAI CTO Mira Murati was evasive.
The principle of data accuracy, as stated in Article 5 of the GDPR, demands that personal data be accurate and up to date.
In ChatGPT’s case, this principle applies to both input data and, problematically, generated outputs.
While the accuracy of AI models is continuously improving, hallucinations remain a problem. The EDPB said it is up to chatbot developers to be transparent about the potential for information to be inaccurate, but added that “although [these measures] are beneficial to avoid misinterpretation of the output of ChatGPT, they are not sufficient to comply with the data accuracy principle.”
In April, the privacy campaign group noyb filed a complaint against OpenAI with the Austrian data protection authority.
The complaint takes issue with OpenAI’s inability to correct inaccuracies in ChatGPT’s outputs or to explain where the underlying data comes from.
“Making up false information is quite problematic in itself. But when it comes to false information about individuals, there can be serious consequences,” stated noyb lawyer Maartje de Graaf.
“It’s clear that companies are currently unable to make chatbots like ChatGPT comply with EU law, when processing data about individuals. If a system cannot produce accurate and transparent results, it cannot be used to generate data about individuals. The technology has to follow the legal requirements, not the other way around,” she stressed.
GDPR grants data subjects several rights, including access, rectification, and erasure of their personal data.
The EDPB noted that OpenAI must ensure these rights are upheld by providing mechanisms for users to access their data, request corrections, or have their data deleted.
The task force emphasized that even though OpenAI allows users to opt out of having their content used for training, it must also ensure that users are adequately informed and their rights are protected throughout the data processing lifecycle.