Artificial intelligence may be evolving at lightning speed, but according to Sapien co-founder Trevor Koverko, the next real leap won’t come from bigger models; it’ll come from better data.
In a conversation with CCN, Koverko discussed how the quality of data feeding AI systems has become a critical frontier, and why decentralization and human participation may hold the key to solving it.
As large language models (LLMs) mature, he argues, simply adding more parameters offers diminishing returns. The challenge now is ensuring that AI learns from diverse, accurate, and ethically sourced data, something that can’t be achieved by automation alone.
That’s where Sapien, his human data network, comes in. By blending blockchain incentives with human-in-the-loop data creation, the company aims to build what Koverko calls “a new category of gig work” for the AI era.
“The next frontier in AI is not just bigger models, but better data,” Koverko said. “Early on, any type of data was better than no data. That’s how models improved so fast in the beginning. But over time, there are diminishing returns. Now you need better quality data, or new kinds of data, or even private data that doesn’t exist yet.”
He described the current environment as a “moving bar,” where the definition of quality continually evolves.
“The industry is asking: where does new data come from? What is high-quality data? Do humans still play a role in creating and structuring it? These are open questions, and they’re evolving fast.”
That evolution is happening across verticals. “We started with text-based data, which was great,” he said.
“Now we’re moving to other modalities. The newest, most complicated, and exciting is 3D and 4D data, such as robotics and real-world data. That’s a whole new industry, because that data doesn’t even exist yet. We had decades of text data stored online without realizing it would be used to train future AI, but that’s not true for robotics data. We have to build it from scratch.”
For Koverko, this means AI’s progress depends on both higher-fidelity data and more diverse data sources.
“We’re moving up the quality bar (more accuracy, more realism) while also moving across industries.”
One of Sapien’s key ideas is that decentralization can solve the bottlenecks in data production.
“Quality is the result,” Koverko said. “If you don’t have it, it’s garbage in, garbage out. And now that standards are higher, and customer expectations are higher, we have to rethink how we produce data.”
Sapien’s approach combines human intelligence with blockchain-based incentives. “We’re building from first principles, asking what a global network of humans looks like and what modern technologies we can use to make it seamless and diverse,” he said.
“The quality of data is directly correlated to the diversity of the humans producing it. If you don’t have diversity across gender, age, socioeconomic background, you end up with biased datasets and biased machines.”
The human-powered model has already attracted prominent names, including Alibaba and the United Nations, but convincing large institutions to trust a decentralized system hasn’t been easy. “It ties back to our idea of ‘Proof of Quality,’” Koverko explained.
“That’s what we call our protocol, an open system that matches the right human with the right dataset and, more importantly, incentivizes them to do good work.”
The key is real-time feedback and aligned incentives. “We reward contributors for good work instantly,” he said.
“We also make sure they have skin in the game, something to lose if they do bad work. It’s a carrot-and-stick model, and it’s working. People want to do good work if you give them the right structure.”
If the model scales, Koverko believes it could eliminate the need for centralized data-labeling centers.
“Today, most major labs still operate out of physical locations. But if we can prove this works, the benefits compound: contributors earn more because there’s no middleman, enterprises pay less for better data, and the overall output is higher quality.”
He compared the potential shift to the rise of ride-sharing. “It’s cliché to use Uber as an example, but it fits. Before Uber, the idea of millions of people earning income through a mobile app didn’t exist. We’re creating a new category of gig work. A few years ago, there were no full-time human data labelers working globally through decentralized systems; now there can be.”
Koverko believes that human-powered data work could unlock new opportunities worldwide. “Humans are realizing who’s really in charge of AI,” he said.
“These models need humans more than humans need AI. And it’s exciting because this kind of work can be done by anyone, anywhere. You don’t need to drive a car or own a delivery route. If you have an Android phone and an internet connection, you can potentially earn a living wage.”
He sees that as an opportunity to reduce poverty and create inclusion. “Someone in rural Manila, someone disabled, or someone who doesn’t speak English: they can still work. It’s a new kind of labor market, one that’s digital, global, and open.”
The company has faced challenges, including what Koverko called “Sybil attacks,” or attempts by bad actors to game the system, typically by creating fake identities. But he takes them as a sign of progress.
“When people try to cheat, it means they care enough to engage,” he said. “It’s a cat-and-mouse game. But every time someone tries, we learn and make the system more robust. They actually help us improve.”
When asked about human data ownership, Koverko said it’s both a technical and a cultural issue. “We’re a Web3 company at our core,” he explained.
“We believe humans should own their data. There are other great protocols, such as Sentient, working toward similar goals. We’re all trying to build a future where power is distributed and people are rewarded for their contributions.”
Still, Sapien’s focus remains pragmatic. “At the end of the day, people want to make money and work,” he added.
“Like Uber drivers, they might not dream about it, but it gives them income flexibility. That’s the same spirit we’re bringing to data. We want to empower people while improving the models.”
Koverko said regulation is less of a threat than over-centralization. “Of course, we follow laws like GDPR,” he said.
“But the real danger is draconian AI laws that slow innovation and protect incumbents. Sometimes it’s big companies lobbying to make it harder for startups to compete.”
Trust, he added, is everything. “When you work with Fortune 100 clients, you have to treat their data as sacred. It’s not just about compliance; it’s about reputation. Customers need to trust that we protect their information and build responsibly.”
In five years, Koverko hopes Sapien will become the global standard for AI data. “Scale means quality,” he said.
“If we can grow our contributor network and our customer base, we can deliver the best data to enterprises and the best income opportunities to humans. That’s how we win.”