Large Language Models (LLMs) sit at the heart of the contemporary AI boom, powering the likes of ChatGPT and almost every other popular AI application today. From automatic text transcribers to image generators, if it uses AI, there’s a good chance an LLM is running in the background.
But Google’s latest language models, a family of open-weight AI models called Gemma, are significantly more lightweight than the LLMs that have dominated the narrative for so long.
As the name suggests, Large Language Models aren’t exactly nimble, with the most advanced models today consisting of over a trillion parameters.
Parameters in AI are the variables that models use to make predictions or decisions. The massive number of parameters in modern LLMs is the result of training them on vast data sets, often for months at a time.
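To make the idea of parameters concrete, here is a minimal sketch in Python, assuming PyTorch is installed, that counts the learnable weights in a toy neural network. The layer sizes are illustrative only; production LLMs scale the same principle up to billions or trillions of values.

```python
import torch.nn as nn

# A toy two-layer network; every weight and bias below is a "parameter"
# that training adjusts to improve the model's predictions.
model = nn.Sequential(
    nn.Linear(128, 256),  # 128 * 256 weights + 256 biases
    nn.ReLU(),
    nn.Linear(256, 10),   # 256 * 10 weights + 10 biases
)

total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {total:,}")  # about 35,600 here
```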
This intense training carried out by the likes of Google and OpenAI has been invaluable for AI research. But LLMs require substantial resources to develop and operate. Hence the explosion of interest in expensive AI hardware and custom-built data centers that can run the most intensive machine learning (ML) workloads.
However, efforts to distill what LLMs have learned into smaller models have yielded impressive results. An emerging cohort of small language models (SLMs) developed by Google, Meta, Mistral and others can now perform many of the same tasks as their larger cousins with significantly reduced computational demands.
According to leaked documents, OpenAI’s GPT-4 has 1.8 trillion parameters. No wonder a typical processor is unable to run the model efficiently.
In contrast, Google’s Gemma comes in 2 billion and 7 billion parameter versions.
“Gemma models share technical and infrastructure components with Gemini,” Google’s largest AI model, the company stated.
“Gemma models are capable of running directly on a developer laptop or desktop computer,” it added, yet they surpass “significantly larger models on key benchmarks.”
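As a rough illustration of what “running on a developer laptop” looks like in practice, the sketch below loads the 2 billion parameter model with the Hugging Face transformers library and generates a short completion. The checkpoint ID google/gemma-2b and the prompt are assumptions for the example, and downloading the weights typically requires accepting Google’s terms on the Hugging Face Hub first.

```python
# Minimal local inference sketch using Hugging Face transformers.
# Assumes the Gemma weights are accessible (license accepted on the Hub).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b"  # assumed checkpoint ID for the 2B model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Small language models are useful because"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```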
The primary appeal of SLMs lies in their efficiency. They offer faster training and inference times, reduced carbon and water footprints, and are more suitable for deployment on edge devices like mobile phones.
As well as being less resource-intensive, smaller models are more efficient learners, requiring less training data than equivalent LLMs. That, in turn, makes them better suited to fine-tuning for specialized tasks.
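As a sketch of what that fine-tuning might look like, the example below attaches a LoRA adapter to a small model using the peft library, so only a small fraction of the weights are updated during training. The checkpoint ID, module names and hyperparameters are illustrative assumptions, and the dataset and training loop are omitted.

```python
# Parameter-efficient fine-tuning sketch using LoRA via the peft library.
# Hyperparameters and target module names are illustrative, not prescriptive.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("google/gemma-2b")  # assumed checkpoint ID

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor applied to the updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights will train
# ...then train with a standard loop or the transformers Trainer.
```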