Key Takeaways
In 2017, a groundbreaking Google research paper introduced a new architecture for neural networks known as the transformer.
The transformer architecture fundamentally changed the trajectory of artificial intelligence and today underpins the powerful language models developed by Google, OpenAI and other leading AI labs.
But what happened to the Google Brain researchers who laid the groundwork for this revolution?
“Attention Is All You Need” was published in June 2017 by a team of researchers at Google Brain, the deep learning team within Google’s AI research division.
The paper proposed a new natural language processing architecture that eliminated the need for recurrence (the mechanism at the heart of traditional recurrent neural networks) in favor of a mechanism known as self-attention.
Self-attention enables AI models to weigh the importance of every word in a sequence against every other word without processing them one at a time, the sequential bottleneck that limited earlier recurrent designs.
This approach means models can scale more efficiently, handle larger datasets, and generate more coherent and contextually aware outputs.
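The mechanism itself is compact. Below is a minimal NumPy sketch of the paper’s scaled dot-product self-attention, Attention(Q, K, V) = softmax(QKᵀ / √d_k)V; the dimensions, weights and inputs are purely illustrative.

```python
# Minimal sketch of scaled dot-product self-attention, following
# "Attention Is All You Need": Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V.
# All sizes and random inputs below are illustrative only.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model) token embeddings; w_*: learned projections."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v             # queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])         # every token scores every token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                              # weighted mix of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                             # e.g. a 4-token sequence
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)       # (4, 8)
```

Because every token attends to every other token in a single matrix multiplication, an entire sequence can be processed in parallel rather than step by step, which is what lets transformers scale so well on modern hardware.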
The concept gained mainstream awareness via OpenAI’s generative pre-trained transformer (GPT) series. Other leading foundation models that use a similar architecture include Google’s Gemini, Meta’s Llama and Anthropic’s Claude.
Since the publication of “Attention Is All You Need,” the paper’s eight authors have been celebrated as technological visionaries who helped kickstart the modern AI boom.
But the Google dream team didn’t stay together for long, and each of its members has since pursued new opportunities away from their former employer (although one, as we’ll see, has recently returned).
Although transformers were originally designed to process language data, Ashish Vaswani is among several of the technology’s inventors who also hold significant expertise in the field of computer vision.
Vaswani’s interest in the intersection of NLP and computer vision can be seen in his work post-Google, which extends the notion of self-attention in new directions, such as image classification.
Like Vaswani’s, Niki Parmar’s research focuses on both linguistic and image-based challenges. Since 2022, the two Google Brain alumni have teamed up to co-found not one but two startups.
Vaswani and Parmar’s first venture was Adept AI, a startup developing multimodal AI solutions based on the transformer architecture.
Their latest entrepreneurial endeavor, Essential AI, is similarly located at the cutting edge of multimodal AI development.
With a mission to create new business tools and “full-stack AI products,” Essential emerged from stealth last year and has already raised nearly $65 million in two funding rounds.
Of the AI entrepreneurs from the Google Brain class of 2017, none have had more success in business than Aidan Gomez.
The startup Gomez founded in 2019, Cohere, is now one of the world’s leading AI labs.
Much like Vaswani and Parmar’s ventures, Cohere has opted to focus on the enterprise market, providing custom large language model solutions for businesses.
Building on the increasingly sophisticated capabilities of transformer-based models, Cohere counts Spotify, Glean and Oracle among its clients and was valued at $5.5 billion at its last funding round in July.
After “Attention Is All You Need,” Noam Shazeer was celebrated as one of Google’s brightest AI visionaries. He played a key role in developing the company’s first transformer-based chatbot, Meena.
Before ChatGPT propelled the technology onto the world stage, Shazeer circulated an internal memo in which he predicted the revolution transformers were about to unleash. However, following disagreements with Google’s top brass, he left the company in 2021.
Out on his own, Shazeer founded Character.AI, a startup focused on customizable chatbots powered by a language model that excels at the art of personality.
Looking back, Shazeer’s foresight was bang on: his belief that Google should have been less risk-averse and thrown its weight behind the emerging chatbot technology has arguably been vindicated.
This year, Google reached a $2.7 billion licensing agreement with Character.AI that has seen Shazeer return to his old employer to oversee Gemini development.
Jakob Uszkoreit was another key author behind the transformer paper, and his expertise in natural language understanding helped shape its development. After the paper’s release, Uszkoreit continued to work on some of Google’s most advanced AI products.
However, his career took a new direction when he left Google to explore the intersection of AI and biotechnology.
In 2020, Uszkoreit co-founded Inceptive, a company that leverages AI to develop mRNA-based medicines, accelerating drug discovery and the design of new treatments.
The transformer’s ability to process vast amounts of biological data has made it an ideal tool for this new venture. Uszkoreit’s work at Inceptive showcases how the technology is increasingly finding a role across diverse industries.
Llion Jones, one of the lesser-known but equally important co-authors of “Attention Is All You Need,” has continued to work on transformers, focusing on improving large language model (LLM) robustness and efficiency.
Like several of his former colleagues, Jones has spent his post-Google career co-founding an AI lab of his own: Tokyo-based Sakana AI, which researches and develops new LLM technologies.
Sakana AI’s list of accomplishments is already impressive. Earlier this year, the startup demonstrated a way to use LLMs themselves to train other LLMs more efficiently.
More recently, Sakana showcased an “AI scientist” that aims to automate the entire research and development process, generating ideas, conducting experiments, summarizing results and writing papers.
Of all the researchers who co-authored “Attention Is All You Need,” Illia Polosukhin’s journey since 2017 has been the most unexpected.
After leaving Google, Polosukhin teamed up with Alexander Skidanov to create NEAR.ai.
Initially focused on program synthesis, Polosukhin and Skidanov soon became interested in smart contracts and in 2018 started building NEAR Protocol.
Unlike that of most of his former colleagues, Polosukhin’s work today doesn’t obviously relate to transformers. Nevertheless, his background in AI has certainly informed NEAR’s development, and the scalable, high-performance blockchain platform is among the most capable of supporting AI-based applications.
Lukasz Kaiser made significant contributions to deep learning research during his years at Google, and since moving to OpenAI in 2021 he has remained a leading figure in LLM research and development.
In a sense, Kaiser’s journey from Google to OpenAI mirrors the trajectory of the transformer architecture itself. After all, although Google laid the foundations for the technology, from GPT-3 onwards, OpenAI has stood at the forefront of innovation.
With his colleagues at OpenAI, Kaiser has pioneered new modes of AI reasoning that have propelled the company’s models to ever-greater performance levels.
Having led the development of OpenAI’s o1 models, Kaiser recently expressed his view that the “transformer with chain of thought” architecture represents the next stage in AI evolution.
Transformers enabled AI models to digest previously unprecedented volumes of data, but Kaiser suggested that the days of “more data equals more intelligence” are coming to an end.
Going forward, he argued, post-training design principles will become more important as researchers structure the powerful language processing capabilities of transformer models in more effective ways.
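To make the idea concrete, here is a rough sketch of explicit chain-of-thought prompting using the OpenAI Python client. The model name is an illustrative assumption, and o1-class models perform this kind of step-by-step reasoning internally rather than relying on an instruction like the one below.

```python
# Rough sketch of explicit chain-of-thought prompting (openai>=1.0).
# The model name "gpt-4o-mini" is an illustrative assumption; o1-style
# models internalize this step-by-step reasoning during training.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = "A train leaves at 9:40 and arrives at 13:05. How long is the trip?"

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        # Asking the model to write out its reasoning before answering is
        # the prompt-level version of "transformer with chain of thought".
        {"role": "system", "content": "Reason step by step, then state the final answer."},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)
```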
Between them, the eight authors of “Attention Is All You Need” are now working on some of today’s most cutting-edge AI projects.
Those who have ventured into entrepreneurship reflect how transformers have grown from an experimental technology into a multi-trillion-dollar business with a growing range of real-world applications.
In a sense, ventures like Cohere and Character.AI can be thought of as riding the first wave of an ongoing LLM revolution, building market-ready solutions to meet businesses’ growing demand for AI chatbots.
Meanwhile, Uszkoreit’s Inceptive and Jones’s Sakana are exploring new possibilities for the technology, pointing to emerging applications that could be the next big thing.