OpenAI's Most Intelligent, Affordable AI
Plus: Mistral AI and NVIDIA collaborate to release a new model, and TTT models might be the next frontier in generative AI.
Hello Engineering Leaders and AI Enthusiasts!
Welcome to the 321st edition of The AI Edge newsletter. This edition features OpenAI’s new small AI model, GPT-4o mini.
And a huge shoutout to our amazing readers. We appreciate you😊
In today’s edition:
🤖 OpenAI introduces GPT-4o mini, its most affordable model
🚀 Mistral AI and NVIDIA collaborate to release a new model
⚡ TTT models might be the next frontier in generative AI
📚 Knowledge Nugget: RAG is more than just vectors by @joel
Let’s go!
OpenAI introduces GPT-4o mini, its most affordable model
OpenAI has introduced GPT-4o mini, its most intelligent, cost-efficient small model. It supports text and vision in the API, with support for text, image, video, and audio inputs and outputs coming in the future. The model has a context window of 128K tokens, supports up to 16K output tokens per request, and has knowledge up to October 2023.
GPT-4o mini scores 82% on MMLU and currently outperforms GPT-4 on chat preferences on the LMSYS leaderboard. It is more affordable than previous frontier models and more than 60% cheaper than GPT-3.5 Turbo.
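The 16K output limit above is a hard per-request ceiling, so requests should clamp their token budget to it. Here is a minimal sketch of building a Chat Completions-style request body for GPT-4o mini; the model name matches OpenAI's announcement, while the helper function and the prompt are illustrative, not from the article:

```python
# Sketch: building a Chat Completions request body for GPT-4o mini.
# The 16K figure comes from the announcement; the helper is illustrative.

MAX_OUTPUT_TOKENS = 16_000  # GPT-4o mini's per-request output ceiling


def build_request(prompt: str, max_tokens: int = 1_000) -> dict:
    """Build a request body, clamping max_tokens to the model's limit."""
    return {
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": min(max_tokens, MAX_OUTPUT_TOKENS),
    }


# A request asking for more than the limit gets clamped to 16K.
req = build_request("Summarize today's AI news.", max_tokens=32_000)
```

To actually send the request, you would POST this body to the Chat Completions endpoint with your API key (e.g., via the official `openai` Python client).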
Why does it matter?
It has been a huge week for small language models (SLMs), with GPT-4o mini, Hugging Face’s SmolLM, and NeMO, Mathstral, and Codestral Mamba from Mistral. GPT-4o mini should significantly expand the range of applications built with AI by making intelligence much more affordable.
Mistral AI and NVIDIA collaborate to release a new model
Mistral has released Mistral NeMo, its best new small model, with a large context window of up to 128k tokens. It was built in collaboration with NVIDIA and released under the Apache 2.0 license.
Its reasoning, world knowledge, and coding accuracy are state-of-the-art in its size category. Relying on standard architecture, Mistral NeMo is easy to use and a drop-in replacement for any system using Mistral 7B. It is also trained on function calling and is particularly strong in English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi.
Why does it matter?
The model is designed for global, multilingual applications with excellence in many languages. This could be a new step toward bringing frontier AI models into everyone's hands, in all the languages that form human culture.
TTT models might be the next frontier in generative AI
Transformers have long been the dominant architecture for AI, powering OpenAI’s Sora, GPT-4o, Claude, and Gemini. But they aren’t especially efficient at processing and analyzing vast amounts of data, at least on off-the-shelf hardware.
Researchers at Stanford, UC San Diego, UC Berkeley, and Meta proposed a promising new architecture this month. The team claims that Test-Time Training (TTT) models can not only process far more data than transformers but that they can do so without consuming nearly as much compute power. Here is the full research paper.
Why does it matter?
On average, a ChatGPT query needs nearly 10x as much electricity to process as a Google search. It may be too early to say whether TTT models will eventually supersede transformers. But if they do, they could allow AI capabilities to grow sustainably.
Enjoying the daily updates?
Refer your pals to subscribe to our daily newsletter and get exclusive access to 400+ game-changing AI tools.
When you use the referral link above or the “Share” button on any post, you'll get the credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.
Knowledge Nugget: RAG is more than just vectors
This article by @joel challenges the common perception that Retrieval-Augmented Generation (RAG) is solely tied to vector databases. The author contends that RAG is essentially about providing context to LLMs from any data source.
RAG can be implemented using various types of databases and data retrieval methods that developers are already familiar with. It compares RAG to function calls in programming, suggesting that it's simply a way to fetch relevant data for the LLM to process.
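The article's point — that RAG is just fetching relevant data for the LLM, like a function call — can be shown without a vector database at all. The sketch below retrieves context by plain keyword overlap over an in-memory store and injects it into the prompt; the documents, scoring heuristic, and function names are illustrative, not from the article:

```python
# Minimal RAG sketch without a vector database: keyword retrieval
# over an in-memory document store, injected as prompt context.

DOCS = {
    "refunds": "Refunds are processed within 5 business days.",
    "shipping": "Standard shipping takes 3-7 business days.",
    "returns": "Items can be returned within 30 days of delivery.",
}


def retrieve(question: str, docs: dict) -> list:
    """Return documents sharing words with the question, best match first."""
    q_words = set(question.lower().split())
    scored = []
    for text in docs.values():
        overlap = len(q_words & set(text.lower().split()))
        if overlap:
            scored.append((overlap, text))
    scored.sort(reverse=True)
    return [text for _, text in scored]


def build_prompt(question: str) -> str:
    """Fetch relevant data and hand it to the LLM -- like a function call."""
    context = "\n".join(retrieve(question, DOCS))
    return f"Context:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("How long do refunds take in business days?")
```

Swapping the keyword lookup for a SQL query, a REST call, or a vector search changes nothing about the overall pattern — which is exactly the article's argument.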
Why does it matter?
By framing RAG in a broader context, the article demystifies the concept and encourages developers to leverage their existing skills and infrastructure when implementing RAG, rather than feeling compelled to adopt new, specialized systems.
What Else Is Happening❗
🔓OpenAI gives customers more control over ChatGPT Enterprise
OpenAI is launching tools to support enterprise customers with managing their compliance programs, enhancing data security, and securely scaling user access. It includes new Enterprise Compliance API, SCIM (System for Cross-domain Identity Management), expanded GPT controls, and more. (Link)
🤝AI industry leaders have teamed up to promote AI security
Google, OpenAI, Microsoft, Anthropic, Nvidia, and other big names in AI have formed the Coalition for Secure AI (CoSAI). The initiative aims to address a “fragmented landscape of AI security” by providing access to open-source methodologies, frameworks, and tools. (Link)
📈DeepSeek open-sources its LLM ranking #1 on the LMSYS leaderboard
DeepSeek has open-sourced DeepSeek-V2-0628, the No. 1 open-source model on the LMSYS Chatbot Arena Leaderboard. It ranks #11 overall, outperforming all other open-source models. (Link)
🏆Groq’s open-source Llama AI model tops GPT-4o and Claude
Groq released two open-source models specifically designed for tool use, built with Meta Llama-3. The Llama-3-Groq-70B-Tool-Use model tops the Berkeley Function Calling Leaderboard (BFCL), outperforming offerings from OpenAI, Google, and Anthropic. (Link)
🗣️Apple, Salesforce break silence on claims they used YouTube videos to train AI
Apple clarified that its OpenELM language model used the dataset for research purposes only and will not be used in any Apple products/services. Salesforce commented that the dataset was publicly available and released under a permissive license. (Link)
New to the newsletter?
The AI Edge keeps engineering leaders & AI enthusiasts like you on the cutting edge of AI. From machine learning to ChatGPT to generative AI and large language models, we break down the latest AI developments and how you can apply them in your work.
Thanks for reading, and see you tomorrow. 😊