Intel's New AI Chip: 50% Faster, Cheaper Than NVIDIA's
Plus: Meta to release Llama 3 open-source LLM next week, Google Cloud announces major updates to enhance Vertex AI.
Hello Engineering Leaders and AI Enthusiasts!
Welcome to the 250th edition of The AI Edge newsletter. This edition features “Intel's New AI Chip: 50% Faster, Cheaper Than NVIDIA's.”
And a huge shoutout to our amazing readers. We appreciate you😊
In today’s edition:
🧠 Intel's new AI chip: 50% faster, cheaper than NVIDIA's
🤖 Meta to Release Llama 3 Open-source LLM next week
☁️ Google Cloud announces major updates to enhance Vertex AI
📚 Knowledge Nugget: Making peace with LLM non-determinism
Let’s go!
Intel's new AI chip: 50% faster, cheaper than NVIDIA's
Intel has unveiled its new Gaudi 3 AI accelerator, which aims to compete with NVIDIA's GPUs. According to Intel, the Gaudi 3 is expected to reduce training time for large language models like Llama2 and GPT-3 by around 50% compared to NVIDIA's H100 GPU. The Gaudi 3 is also projected to outperform the H100 and H200 GPUs in terms of inference throughput, with around 50% and 30% faster performance, respectively.
The Gaudi 3 is built on a 5nm process and offers several improvements over its predecessor, including doubling FP8 and quadrupling BF16 processing power, along with increased network and memory bandwidth. Intel is positioning the Gaudi 3 as an open, cost-effective alternative to NVIDIA's GPUs, with plans to make it available to major OEMs starting in the second quarter of 2024. The company is also working to create an open platform for enterprise AI with partners like SAP, Red Hat, and VMware.
Why does it matter?
Intel is challenging NVIDIA's dominance in the AI accelerator market, introducing more choice and competition in high-performance AI hardware. That could drive down prices, spur innovation, and give customers more flexibility in building AI systems. The open approach with community-based software and standard networking aligns with broader trends toward open and interoperable AI infrastructure.
Meta to release Llama 3 open-source LLM next week
Meta plans to release two smaller versions of its upcoming Llama 3 open-source language model next week. These smaller models will build anticipation for the larger version, which will be released this summer. Llama 3 will be a significant upgrade over previous versions, with about 140 billion parameters compared to 70 billion for the biggest Llama 2 model. It will also be a more capable, multimodal model that can generate text and images and answer questions about images.
The two smaller versions of Llama 3 will focus on text generation. They’re intended to resolve safety issues before the full multimodal release. Previous Llama models were criticized as too limited, so Meta has been working to make Llama 3 more open to controversial topics while maintaining safeguards.
Why does it matter?
The open-source AI model landscape has become much more competitive in recent months, with other companies like Mistral and Google DeepMind also releasing their own open-source models. Meta hopes that by making Llama 3 more open and responsive to controversial topics, it can catch up to models like OpenAI's GPT-4 and become a standard for many AI applications.
Google Cloud announces major updates to enhance Vertex AI
Google Cloud has announced exciting model updates and platform capabilities that continue to enhance Vertex AI:
Gemini 1.5 Pro: Gemini 1.5 Pro is now available in public preview in Vertex AI, bringing the world’s first one-million-token context window to customers. It also supports the ability to process audio streams, including speech and even the audio portion of videos.
Imagen 2.0: Imagen 2.0 can now create short, 4-second live images from text prompts, enabling marketing and creative teams to generate animated content. It also has new image editing features like inpainting, outpainting, and digital watermarking.
Gemma: Google Cloud is adding CodeGemma to Vertex AI. CodeGemma is a new lightweight model from Google's Gemma family based on the same research and technology used to create Gemini.
MLOps: To help customers manage and deploy these large language models at scale, Google has expanded the MLOps capabilities for Gen AI in Vertex AI. This includes new prompt management tools for experimenting, versioning, optimizing prompts, and enhancing evaluation services to compare model performance.
Why does it matter?
These updates significantly enhance Google Cloud's generative AI offerings. They also strengthen Google's position in the generative AI space and its ability to support enterprise adoption of these technologies.
Enjoying the daily updates?
Refer your pals to subscribe to our daily newsletter and get exclusive access to 400+ game-changing AI tools.
When you use the referral link above or the “Share” button on any post, you'll get credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.
Knowledge Nugget: Making peace with LLM non-determinism
In this article, the author examines the problem of non-determinism in LLMs. To understand it, he reads, researches, and consults a friend, then walks us through that process, discussing aspects such as misconceptions about sampling and hardware non-determinism. Along the way, he concludes that what makes working with LLMs feel random is not just the occasional non-determinism but language itself. He suggests remedies such as using more accurate and steerable models, reducing unnecessary non-determinism, decreasing input ambiguity, and defining clear evaluation criteria.
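To make the “reducing unnecessary non-determinism” point concrete: one common knob is the sampling step itself. Here is a minimal Python sketch (not the article's code) of temperature-based token sampling, showing how greedy decoding (temperature of zero) removes sampling randomness entirely, and how a seeded RNG makes non-greedy sampling reproducible:

```python
import math
import random

def sample_token(logits, temperature=1.0, rng=None):
    """Pick a token index from raw logits.

    temperature <= 0 is treated as greedy decoding (argmax),
    which removes sampling randomness entirely.
    """
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax with temperature: lower temperature sharpens the distribution.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # A seeded RNG makes the "random" choice reproducible run to run.
    rng = rng or random.Random()
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

logits = [2.0, 1.0, 0.5, -1.0]

# Greedy decoding: same logits always give the same token.
assert sample_token(logits, temperature=0) == 0

# Seeded sampling: identical seeds give identical draws.
a = sample_token(logits, temperature=0.8, rng=random.Random(42))
b = sample_token(logits, temperature=0.8, rng=random.Random(42))
assert a == b
```

As the article notes, though, even with sampling pinned down, floating-point and hardware effects plus the ambiguity of language itself can still make outputs feel unpredictable.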
Why does it matter?
This article explores why LLMs can be unpredictable even when given the same input and argues it's not just due to technical glitches but because language itself is ambiguous. This matters because we might need to accept some level of randomness in LLMs and develop new techniques to manage it rather than solely focus on eliminating it.
What Else Is Happening❗
🚀 OpenAI launches GPT-4 Turbo with Vision model through API
OpenAI has unveiled the latest addition to its AI arsenal, the GPT-4 Turbo with Vision model, which is now “generally available” through its API. This new version has enhanced capabilities, including support for JSON mode and function calling for Vision requests. The upgraded GPT-4 Turbo model promises improved performance and is set to roll out in ChatGPT. (Link)
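For the curious, a Vision request is an ordinary chat completion whose message content mixes text and image parts, with JSON mode requested via `response_format`. A hypothetical sketch of the request body (the image URL and prompt are illustrative placeholders; check OpenAI's API reference before relying on exact field names):

```python
import json

# Illustrative request body for a GPT-4 Turbo with Vision chat completion.
# The image URL and prompt text are placeholders, not real values.
payload = {
    "model": "gpt-4-turbo",
    "response_format": {"type": "json_object"},  # JSON mode
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this image as JSON with keys 'objects' and 'scene'."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
}

print(json.dumps(payload, indent=2))
```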
👂 Google’s Gemini 1.5 Pro can now listen to audio
Google’s update to Gemini 1.5 Pro gives the model ears. It can process text, code, video, and uploaded audio streams, including audio from video, which it can listen to, analyze, and extract information from without a corresponding written transcript. (Link)
💰 Microsoft to invest $2.9 billion in Japan’s AI and cloud infrastructure
Microsoft announced it would invest $2.9 billion over the next two years to increase its hyperscale cloud computing and AI infrastructure in Japan. It will also expand its digital skilling programs with the goal of providing AI skills to more than 3 million people over the next three years. (Link)
👩💻 Google launches Gemini Code Assist, the latest challenger to GitHub’s Copilot
At its Cloud Next conference, Google unveiled Gemini Code Assist, its enterprise-focused AI code completion and assistance tool. It provides various functions such as enhanced code completion, customization, support for various repositories, and integration with Stack Overflow and Datadog. (Link)
🛍️ eBay launches AI-driven ‘Shop the Look’ feature on its iOS app
eBay launched an AI-powered feature to appeal to fashion enthusiasts: “Shop the Look” on its iOS mobile application. It will suggest a carousel of images and ideas based on the customer’s shopping history, with recommendations personalized to the end user. The idea is to show how other fashion items might complement a customer’s current wardrobe. (Link)
New to the newsletter?
The AI Edge keeps engineering leaders & AI enthusiasts like you on the cutting edge of AI. From machine learning to ChatGPT to generative AI and large language models, we break down the latest AI developments and how you can apply them in your work.
Thanks for reading, and see you tomorrow. 😊