Google’s Gemini May Kill GPT-4
Plus: Meta’s new image AI and core AI experiences for its apps, Apple releases MLX.
Hello Engineering Leaders and AI Enthusiasts!
Welcome to the 163rd edition of The AI Edge newsletter. This edition brings you all the details about the (alleged) ChatGPT killer, Google’s Gemini.
And a huge shoutout to our amazing readers. We appreciate you😊
In today’s edition:
🚀 Google launches Gemini, its largest, most capable model yet
📱 Meta’s new image AI and core AI experiences across its apps family
🛠️ Apple quietly releases a framework, MLX, to build foundation models
📚 Knowledge Nugget: Machine Reasoning: The forgotten side of AI
Let’s go!
We need your help!
We are working on a Gen AI survey and would love your input.
It takes just 2 minutes.
The survey insights will help us both.
And hey, you might also win a $100 Amazon gift card!
Every response counts. Thanks in advance!
Google launches Gemini, its largest, most capable model yet
It looks like ChatGPT’s ultimate competitor is here. After much anticipation, Google has launched Gemini, its most capable and general model yet. Here’s everything you need to know:
Built from the ground up to be multimodal, it can generalize across and seamlessly understand, operate on, and combine different types of information, including text, code, audio, image, and video. (Check out this incredible demo)
Its first version, Gemini 1.0, is optimized for different sizes: Ultra for highly complex tasks, Pro for scaling across a wide range of tasks, and Nano as the most efficient model for on-device tasks.
Gemini Ultra’s performance exceeds current state-of-the-art (SoTA) results on 30 of the 32 widely used academic benchmarks in LLM research and development.
With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU.
It has next-gen capabilities: sophisticated reasoning, advanced math and coding, and more.
Gemini 1.0 is now rolling out across a range of Google products and platforms: Pro in Bard (making Bard noticeably more capable), Nano on Pixel, and Ultra rolling out early next year.
Why does this matter?
Gemini outperforms GPT-4 on a range of benchmarks, including text, coding, and multimodal tasks, and Gemini Pro outperforms GPT-3.5 on 6 of 8 benchmarks, making it arguably the most powerful free chatbot available today. This highlights Gemini’s native multimodality, which could threaten OpenAI’s dominance, and shows early signs of more complex reasoning abilities.
However, the true test of Gemini’s capabilities will come from everyday users. We'll have to wait and see if it helps Google catch up to OpenAI and Microsoft in the race to build great generative AI.
Meta’s new image AI and core AI experiences across its apps family
Meta is rolling out Imagine with Meta, a new standalone generative AI experience on the web that creates images from natural language text prompts. It is powered by Meta’s Emu model and generates four high-resolution images per prompt. It’s free to use (at least for now) for users in the U.S., and Meta is also rolling out invisible watermarking for the images it generates.
Meta is also testing more than 20 new ways generative AI can improve your experiences across its family of apps, spanning search, social discovery, ads, business messaging, and more. For instance, it is adding new generative AI features to the messaging experience while also using the technology behind the scenes to power smart capabilities.
It is likewise testing ways to easily create and share AI-generated images on Facebook.
Why does this matter?
Meta has been at the forefront of AI research, which will help unlock new capabilities in its products over time, as with other Big Tech companies. And while it is still just scratching the surface of what AI can do, it is continually listening to people’s feedback and improving.
Apple quietly releases a framework to build foundation models
Apple’s ML research team released MLX, a machine learning framework for building models that run efficiently on Apple Silicon, along with MLX Data, a companion data-loading library. Both are available through open-source repositories on GitHub and PyPI.
MLX is intended to be easy for developers to use while powerful enough to train AI models like Meta’s Llama and Stable Diffusion. Apple’s demo video shows a Llama v1 7B model implemented in MLX running on an M2 Ultra.
Why does this matter?
Frameworks and model libraries power many of the AI apps in the market today. And Apple, though often seen as conservative, has joined the fray with frameworks and model libraries tailored for its chips, potentially enabling generative AI applications on MacBooks. With MLX, you can:
Train a Transformer language model or fine-tune one with LoRA
Generate text with Mistral
Generate images with Stable Diffusion
Transcribe speech with Whisper
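Of those examples, LoRA fine-tuning is worth unpacking. Here is a minimal, framework-agnostic sketch of the LoRA idea in plain Python/NumPy (an illustration of the technique, not MLX’s actual API): instead of updating a frozen weight matrix, you train two small low-rank matrices whose product forms the update.

```python
import numpy as np

# LoRA (Low-Rank Adaptation) in a nutshell: keep the pretrained weight
# W (d x k) frozen, and train two small matrices A (r x k) and B (d x r)
# with rank r << min(d, k). The effective weight is W + (alpha / r) * B @ A.
rng = np.random.default_rng(0)
d, k, r, alpha = 512, 512, 8, 16

W = rng.normal(size=(d, k))          # frozen pretrained weight
A = rng.normal(size=(r, k)) * 0.01   # trainable, small random init
B = np.zeros((d, r))                 # trainable, zero init

def lora_forward(x):
    # Equivalent to x @ (W + (alpha/r) * B @ A).T,
    # but never materializes the full d x k update.
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.normal(size=(2, k))
# Because B starts at zero, the adapter contributes nothing at initialization,
# so the output matches the frozen pretrained layer exactly.
assert np.allclose(lora_forward(x), x @ W.T)
print("trainable params:", d * r + r * k, "vs full:", d * k)
```

Training then touches only A and B: in this sketch, 8,192 parameters instead of 262,144 for the full matrix, which is why LoRA fine-tuning fits comfortably on a single machine.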
Enjoying the daily updates?
Refer your pals to subscribe to our daily newsletter and get exclusive access to 400+ game-changing AI tools.
When you use the referral link above or the “Share” button on any post, you'll get the credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.
Knowledge Nugget: Machine Reasoning: The forgotten side of AI
With the incredible achievement of ChatGPT, other LLMs, and image/audio/video generation tech, all the focus is on neural networks.
But neural networks are not the only approach to AI.
Not only are there many other forms of Machine Learning (ML), but there is also Machine Reasoning (MR).
In this intriguing newsletter post, the author discusses MR, where the term came from (it is made up🤫), how it differs from ML, and the technologies behind MR.
Why does this matter?
While deep learning steals the spotlight, it is worth remembering that ML and MR can be combined, potentially addressing complex problems and enabling innovative solutions by leveraging the strengths of each approach.
What Else Is Happening❗
💻Google unveils AlphaCode 2, powered by Gemini.
It is an improved version of the code-generating AlphaCode introduced by Google’s DeepMind lab roughly a year ago. In a subset of programming competitions hosted on Codeforces, a platform for programming contests, AlphaCode 2– coding in Python, Java, C++, and Go– performed better than an estimated 85% of competitors. (Link)
☁️Google announces the Cloud TPU v5p, its most powerful AI accelerator yet.
Alongside Gemini’s launch, Google announced the Cloud TPU v5p, an updated version of its Cloud TPU v5e, which entered general availability earlier this year. A v5p pod consists of 8,960 chips and is backed by Google’s fastest interconnect yet, at up to 4,800 Gbps per chip. Google observed 2x speedups for LLM training workloads using TPU v5p vs. v4. (Link)
🚀AMD’s Instinct MI300 AI chips to challenge Nvidia; backed by Microsoft, Dell, and HPE.
The chips– which are also getting support from Lenovo, Supermicro, and Oracle– represent AMD’s biggest challenge yet to Nvidia’s AI computing dominance. It claims that the MI300X GPUs, which are available in systems now, come with better memory and AI inference capabilities than Nvidia’s H100. (Link)
🍟McDonald’s will use Google AI to make sure your fries are fresh, or something?
McDonald’s is partnering with Google to deploy generative AI beginning in 2024 and will be able to use GenAI on massive amounts of data to optimize operations. At least one outcome will be– according to the company– “hotter, fresher food” for customers. While that’s unclear, we can expect more AI-driven automation at the drive-throughs. (Link)
🔒Gmail gets a powerful AI update to fight spam with the 'RETVec' feature.
The update, known as RETVec (Resilient and Efficient Text Vectorizer), makes text classifiers more efficient and robust, and it works across all languages and character sets. Google has made it open-source, allowing developers to use its capabilities to build resilient and efficient text classifiers for server-side and on-device applications. (Link)
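The core idea behind a resilient vectorizer like RETVec is to embed text at the character level, so typo- and obfuscation-style evasions land near the original string instead of escaping the classifier. A toy, framework-free sketch of that intuition (a hypothetical illustration using simple character hashing, not the actual RETVec algorithm or API):

```python
import hashlib

def char_vectorize(text: str, dim: int = 64) -> list[float]:
    """Toy character-hashing vectorizer: each character bumps one bucket,
    so small edits perturb only a few buckets, not the whole vector."""
    vec = [0.0] * dim
    for ch in text.lower():
        bucket = int(hashlib.md5(ch.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

clean = char_vectorize("win a free prize now")
evaded = char_vectorize("w1n a fr3e priz3 now")       # obfuscated spam
unrelated = char_vectorize("quarterly report attached")
# The obfuscated variant stays closer to the original than unrelated text.
print(cosine(clean, evaded) > cosine(clean, unrelated))
```

A word-level vectorizer would treat "fr3e" as an entirely new token; character-level embeddings are what make spam obfuscation tricks much less effective, which is the property RETVec brings to Gmail’s classifiers.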
That's all for now!
Subscribe now to join the prestigious readership of The AI Edge alongside professionals from Moody’s, Vonage, Voya, WEHI, Cox, INSEAD, and other top organizations.
Thanks for reading, and see you tomorrow. 😊