Key Takeaways
The U.K. government plans to create a “National Data Library” that could be used to monetize public records, including health and education data.
Discussing the topic with the Financial Times, Technology Secretary Peter Kyle said he wasn’t “squeamish” about commercializing public data, as long as there are civic benefits.
Because civic records are generally subject to the highest privacy requirements, they represent one of the last untapped reservoirs of AI training data.
But now, the U.K. government plans to open the country’s public data sets to businesses and researchers building AI tools.
In January, Kyle presented the AI Opportunities Action Plan to Parliament, calling for the public sector to collect more high-quality data and to “responsibly unlock” existing resources.
More recently, the government launched a program to equip public-sector data scientists with new AI skills. At the same time, efforts to boost public-sector adoption of AI tools have become increasingly central to the government’s ongoing efficiency drive.
The AI Opportunities Action Plan points to examples of public databases from which the National Data Library could draw inspiration.
These include the U.K. Biobank, a trove of biological samples, physical measurements, body and brain imaging data, bone density data, activity tracking and questionnaire answers collected from half a million people.
In another example, Kyle pointed to a new Department for Education initiative to use student data to build AI tools for the classroom.
While there have been some efforts to organize public data for AI training in the past, the idea of charging for access is more novel.
The U.K. Biobank provides a picture of what this might look like for health data. For three years of access to the platform, users pay up to £9,000, with discounts available for student researchers and customers from lower-income countries.
Kyle acknowledged that similar anonymized health records “will be used for public benefit” as part of the proposed National Data Library. “Part of that will be the commercialization of it for scientific endeavor,” he added.
Although he didn’t provide concrete details, the technology secretary suggested the government could start selling public data within a decade.
Meanwhile, the AI Opportunities Action Plan says the government should identify at least five “high-impact public datasets” it can rapidly make available.
Alongside the nation’s health and education systems, British cultural institutions could also provide a gold mine of unique data.
For example, the government’s action plan calls for bodies such as the BBC and British museums to contribute media assets that could be licensed to AI developers.