Key Takeaways
Since mid-2024, AI developers have increasingly equipped their models with the ability to use computers as the technology evolves from chatbots that speak to agents that act.
Although Google and Anthropic were the first to release computer-using agents, industry leader OpenAI has just dropped “Operator,” setting new performance standards for the emerging class of AI.
Equipped with its own dedicated web browser, Operator can view webpages and interact with them by typing, clicking, and scrolling, opening up a whole host of new possibilities compared to previous chat-based OpenAI platforms.
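To make the "view, then type, click, or scroll" idea concrete, here is a minimal sketch of an observe-and-act loop. It is not OpenAI's implementation or API: `ToyBrowser`, `Action`, `scripted_policy`, and `run_agent` are hypothetical names invented for illustration, and a hard-coded rule stands in for the model that would normally choose each action from a screenshot.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Action:
    kind: str          # "type", "click", or "scroll"
    target: str = ""   # hypothetical element label, e.g. "email" or "submit"
    text: str = ""     # text to enter for "type" actions


@dataclass
class ToyBrowser:
    """In-memory stand-in for a real browser (assumption, not a real API)."""
    fields: dict = field(default_factory=dict)
    scroll_y: int = 0
    submitted: bool = False

    def observe(self) -> dict:
        # A real agent would receive a screenshot; here we return plain state.
        return {
            "fields": dict(self.fields),
            "scroll_y": self.scroll_y,
            "submitted": self.submitted,
        }

    def apply(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "scroll":
            self.scroll_y += 500
        elif action.kind == "click":
            if action.target == "submit":
                self.submitted = True
        else:
            raise ValueError(f"unknown action kind: {action.kind}")


def scripted_policy(observation: dict, task: dict) -> Optional[Action]:
    """Pick the next action; a real agent would query a model here."""
    for name, value in task.items():
        if observation["fields"].get(name) != value:
            return Action("type", target=name, text=value)
    if not observation["submitted"]:
        return Action("click", target="submit")
    return None  # nothing left to do


def run_agent(browser: ToyBrowser, task: dict, max_steps: int = 10) -> list:
    """Loop: observe the page, choose an action, apply it, repeat."""
    history = []
    for _ in range(max_steps):
        action = scripted_policy(browser.observe(), task)
        if action is None:
            break
        browser.apply(action)
        history.append(action)
    return history


if __name__ == "__main__":
    browser = ToyBrowser()
    steps = run_agent(browser, {"name": "Ada", "email": "ada@example.com"})
    print([step.kind for step in steps], browser.submitted)
```

The step cap (`max_steps`) is a design choice in this sketch: an agent that acts on a live page needs some bound so that a confused policy cannot loop indefinitely.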
According to OpenAI’s announcement, “Operator can be asked to handle a wide variety of repetitive browser tasks such as filling out forms, ordering groceries, and even creating memes.”
As the company put it: “The ability to use the same interfaces and tools that humans interact with on a daily basis broadens the utility of AI, helping people save time on everyday tasks while opening up new engagement opportunities for businesses.”
The launch of Operator points to growing competition in the computer-using agent sector, where it is vying for position with Anthropic’s Claude Computer Use and Google’s Project Mariner, although all three are technically still in beta.
Performance benchmarks reveal that OpenAI’s entry has set a new bar.
The new agent scored 38.1% on OSWorld, a benchmark focused on computer interaction, surpassing Anthropic’s Claude Computer Use, which managed only 22.0%.
On WebVoyager, which assesses agents’ ability to navigate and retrieve data from the internet, Operator scored 87%, pulling ahead of the 83.5% achieved by Google’s Project Mariner. Meanwhile, Anthropic’s Claude Computer Use lagged behind at just 56%.
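The size of those gaps is easier to see in percentage points. The short sketch below uses only the scores quoted above; the `scores` layout and the `leads` helper are illustrative choices, not part of any benchmark's tooling.

```python
# Scores as reported in the article, in percent.
scores = {
    "OSWorld": {"Operator": 38.1, "Claude Computer Use": 22.0},
    "WebVoyager": {
        "Operator": 87.0,
        "Project Mariner": 83.5,
        "Claude Computer Use": 56.0,
    },
}


def leads(benchmark_scores, leader="Operator"):
    """Return the leader's margin over each other agent, in percentage points."""
    top = benchmark_scores[leader]
    return {
        name: round(top - score, 1)
        for name, score in benchmark_scores.items()
        if name != leader
    }


if __name__ == "__main__":
    for benchmark, results in scores.items():
        print(benchmark, leads(results))
```

On these figures, Operator leads Claude Computer Use by 16.1 points on OSWorld and by 31.0 points on WebVoyager, while its margin over Project Mariner on WebVoyager is a much narrower 3.5 points.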
The ability of AI agents to interact with computers holds huge potential for both everyday AI users and commercial or industrial applications.
Foreseeing potential use cases, OpenAI said it collaborated with DoorDash, Instacart, OpenTable, Uber, and others to ensure Operator addresses real-world needs.
Meanwhile, the new platform lets users personalize their workflows by adding custom instructions, such as preferences for specific websites.
As the technology evolves, perhaps the greatest potential for AI computer use lies in its integration with popular smart assistants like Siri and Alexa.
These platforms have traditionally integrated third-party applications on an app-by-app basis. But if they are enhanced with Operator-style functionality, the smart assistants of tomorrow will be able to interact with a potentially limitless array of apps and services.