When Google unveiled Gemini in December, it announced that the advanced “multimodal” AI model would launch in three sizes: an efficient “Nano” model for on-device tasks, a “Pro” model for heavier workloads, and an “Ultra” version for the most complex requirements.
While Gemini Pro went live in December, powering the company’s Bard chatbot, there has been no official announcement on a launch date for Gemini Ultra. An unpublished changelog suggests the most powerful version of Gemini will be made available this week. To coincide with the launch, Google also appears to be retiring the Bard brand, renaming its chatbot to reflect the AI running under the hood.
According to the leaked announcement, “in the Gemini era” Google will offer a more streamlined user interface to reduce visual distractions, improve legibility, and simplify navigation.
Upon the launch of Gemini Ultra, Google’s “best family of AI models” will be available to all users across supported countries and languages.
“To better reflect this commitment, we’ve renamed Bard Gemini,” the document states.
Gemini Ultra will be available through a new paid-for “Gemini Advanced” service equivalent to ChatGPT Pro. The free version will continue to run Gemini Pro.
In the competition for users, Bard lags behind the market leader: OpenAI’s ChatGPT. But while it may be less popular than its rival at the moment, Google’s latest efforts suggest it is betting that multimodal services will define the next wave of generative AI.
When OpenAI debuted GPT-4 Turbo in November, the introduction of a “with vision” module established multimodal AI as the next key battle line among Big Tech AI developers.
Previously, the GPT family consisted of pure language models, while visual tasks were handled by the DALL-E models. With GPT-4 Turbo, however, visual and linguistic AI capabilities are rolled into one.
When Google entered the fray with Gemini a month later, it set the stage for a showdown between the world’s two most dominant AI developers, with each one attempting to outperform the other in areas such as visual reasoning and text-to-image generation.
When it unveiled Gemini in December, Google made much of its superior performance compared to GPT-4, which it outscored on 30 out of 32 benchmarks.
However, the company soon came under fire for misleading marketing after it emerged that a promotional video did not show Gemini operating in a live usage environment.
Instead, Google’s head of Deep Learning explained that the video was made “to inspire developers” and presented an idealized version of the kind of multimodal user experience Gemini could deliver, but not straight out of the box.
Despite the backlash against Google’s PR slip-up, Gemini’s performance when benchmarked against GPT-4 shouldn’t be underestimated. As the AI researcher Alberto Romero has observed, the latest model represents “the first time in four years that anyone has taken the lead from OpenAI.”
Nevertheless, he added the caveat that Google only benchmarked Gemini against the standard GPT-4 model, not the more advanced GPT-4 Turbo.
Like many observers of Google and OpenAI’s ongoing technology race, Romero concluded that a true comparison will have to wait until the launch of Gemini Advanced, at which point users will be able to compare the performance of the two models and decide for themselves which one is better.