Key Takeaways
Last week, OpenAI launched GPT-4o mini, billing the new model as a more cost-efficient alternative to the full-size GPT-4o.
The scaled-down GPT model promises the same core functionalities as its predecessor at a fraction of the price. But a smaller model comes with trade-offs.
In a statement on Thursday, July 18, OpenAI said the new model is priced at $0.15 per million input tokens and $0.60 per million output tokens.
That puts it at roughly 3% of the cost of the standard GPT-4o model, and, according to OpenAI, more than 60% cheaper than GPT-3.5 Turbo.
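The pricing gap is easiest to see with a quick back-of-the-envelope calculation. The sketch below uses the GPT-4o mini rates quoted above; the GPT-4o rates used for comparison ($5 input / $15 output per million tokens) are assumed from OpenAI's published pricing at the time, and the token counts are purely illustrative.

```python
# Rough per-request cost comparison. GPT-4o mini rates are from OpenAI's
# announcement; the full-size GPT-4o rates are assumed for comparison.
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),  # $ per 1M input tokens, $ per 1M output tokens
    "gpt-4o": (5.00, 15.00),      # assumed full-size rates at launch
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a single request for the given model."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: a request with 10,000 input tokens and 1,000 output tokens.
mini = request_cost("gpt-4o-mini", 10_000, 1_000)  # roughly $0.002
full = request_cost("gpt-4o", 10_000, 1_000)       # roughly $0.065
```

Under these assumed rates, the mini model works out to a few percent of the full-size model's cost per request, in line with the figures above.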
While affordability is a major advantage, it comes at the expense of raw power.
As a smaller model, GPT-4o mini is more limited in its ability to handle complex tasks and generate nuanced responses.
Nonetheless, early reviews have largely praised the new AI model’s capabilities, with users reporting that outputs are sufficiently sophisticated given the significant token savings on offer.
Ultimately, the choice between GPT-4o and GPT-4o mini boils down to specific needs and budget constraints.
For users prioritizing affordability and basic tasks like chatbots or simple content generation, GPT-4o mini is a compelling option. For those requiring more advanced capabilities, however, the larger model remains the superior choice.
While it might not be a direct replacement for its predecessor, the new cost-effective option opens doors for wider adoption and experimentation in AI.
The release of GPT-4o mini marks an important shift towards more efficient AI solutions.
Small models like Google’s Gemma and Meta’s LLaMA-3 8B have gained significant traction in 2024.
OpenAI’s latest model dropped on the same day Mistral unveiled NeMo. Both models have a similar context window and parameter count, placing them at the larger end of the small-model spectrum.
GPT-4o mini has 12 billion parameters, compared to the smallest LLaMA, Claude and Gemini models, which range from 7 billion to 9 billion parameters. Meanwhile, Google’s Gemma comes in 2 billion and 7 billion versions.
In terms of bang for your buck, OpenAI’s small language model takes the lead, with lower token costs than NeMo and the equivalent Claude, LLaMA and Gemini models.
In a comparative analysis of their performance, NeMo was found to generate responses faster than GPT-4o mini. However, OpenAI’s model ranked highest among the SLMs for overall quality.
After years of LLMs dominating AI development, the field has now advanced far enough that small language models (SLMs) can effectively condense the learning of their larger cousins with significantly lower computational demands.
With OpenAI now jumping on the SLM train, the era of the mini AI model has well and truly arrived.