Mistral Breaks Barriers with New Multimodal AI
Plus: OpenAI debuts o1 models, Google AI Studio launches comparison mode, Luma drops Dream Machine API, and more.
Hello Engineering Leaders and AI Enthusiasts!
This newsletter brings you the latest AI updates in just 4 minutes! Dive in for a quick summary of everything important that happened in AI over the last week.
And a huge shoutout to our amazing readers. We appreciate you😊
In today’s edition:
🆕 Meet Pixtral 12B - Mistral's new Multimodal AI Model
🤖 OpenAI debuts o1 models boasting PhD-level performance
🔬 Google AI Studio launches model comparison mode
💰 Meta Connect 2024: What to expect?
🧠 Luma drops Dream Machine API just hours after Runway
🚀 OpenAI’s new ‘Safety Board’ can halt model releases
📚 Knowledge Nugget: Role-playing with AI will be a powerful tool for writers and educators, by historian Benjamin Breen
Let’s go!
Meet Pixtral 12B - Mistral's new Multimodal AI Model
Mistral has released a new AI model that can process both images and text. The 12-billion-parameter model is built on Mistral’s Nemo 12B text model and displays capabilities like problem-solving. It accepts images as URLs or as base64-encoded data (a binary-to-text encoding) and can perform tasks like image captioning and counting the number of objects in a photo.
Pixtral 12B can be accessed via a GitHub link or the Hugging Face platform. Users can download, fine-tune, and use it under an Apache 2.0 license without any restrictions.
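For readers who want to try the open weights, here is a minimal sketch of pulling them from Hugging Face with the huggingface_hub library; the repo id and local path are assumptions based on Mistral’s usual naming, so check the official model card before running it.

```python
# Minimal sketch: download the open Pixtral 12B weights from Hugging Face.
# The repo id below is an assumption based on Mistral's naming conventions --
# confirm it against the official model card first.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="mistralai/Pixtral-12B-2409",  # assumed repo id
    local_dir="./pixtral-12b",             # where the Apache 2.0 weights land
)
print(f"Pixtral weights downloaded to {local_dir}")
```

From there, the weights can be fine-tuned or served with any inference stack that supports the Pixtral architecture.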
Why does it matter?
Pixtral 12B's open-access nature and versatile capabilities make it a game-changer for AI enthusiasts and tech innovators. Additionally, its competitive positioning against established players like OpenAI and Anthropic could drive further competition and advancements in multimodal AI.
OpenAI debuts o1 models boasting PhD-level performance
OpenAI has released two new reasoning models, o1-preview and o1-mini, designed to excel at complex reasoning and problem-solving tasks. Here’s a quick peek at some of their impressive features, with a minimal API-call sketch after the list:
The o1-preview excels at complex tasks, while the o1-mini offers a faster, more cost-effective alternative for STEM fields like coding and mathematics.
Both models use reinforcement learning and advanced chain-of-thought processes to handle complex problems, which helps them excel at competitive programming, mathematics, and science while reducing hallucinations.
The o1 models use an advanced safety mechanism and show strong performance in content evaluations and resistance to jailbreaks.
The o1-preview performed better than GPT-4 in fairness evaluations and handled ambiguous questions better.
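For developers with API access, below is a minimal sketch of calling o1-preview through the official OpenAI Python client; at launch the o1 models reportedly do not accept system prompts or sampling parameters such as temperature, so the request is kept deliberately bare. Treat the details as illustrative, not definitive.

```python
# Minimal sketch: asking o1-preview to reason through a multi-step problem.
# Requires the `openai` package (v1+) and OPENAI_API_KEY set in the environment;
# model availability depends on your account tier.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-preview",  # or "o1-mini" for the faster, cheaper STEM-focused variant
    messages=[
        {
            "role": "user",
            "content": (
                "Derive the closed-form sum of the first n odd numbers, "
                "showing each reasoning step."
            ),
        }
    ],
)

print(response.choices[0].message.content)
```

The same call works for o1-mini by swapping the model name.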
Why does it matter?
The high reasoning capabilities showcased by these models can be helpful to diverse fields. Physicists can use o1 models to generate complex mathematical formulas required for quantum optics; developers can use them to execute multi-step workflows and more.
Google AI Studio launches model comparison mode
The new comparison mode lets users easily view, side by side, how different AI models and parameters affect outputs, and evaluate differences in model progress and speed. The feature works for both text and multimodal prompts.
Developers can access it through the top-level “Compare” button in the UI.
Why does it matter?
This feature will save developers significant time by simplifying the evaluation process, allowing them to quickly compare model outputs and performance metrics without the need for extensive manual testing or analysis.
Meta Connect 2024: What to expect?
The two-day event, which kicks off on September 25, will highlight the company’s latest hardware and software innovations. Here’s what to expect from Meta Connect 2024:
In a keynote presented by CEO Mark Zuckerberg, the event will cover Meta’s latest offerings across its AR/VR headsets, smart glasses and wearables, and AI divisions.
Meta will likely unveil Orion - its next-generation augmented reality glasses that can layer holographic imagery on top of reality.
Meta is also rumored to be releasing new Ray-Ban smart glasses with a built-in screen, camera, speaker, and microphone.
Meta might also reveal the Quest 3S, a new, more cost-effective version of the Quest 3 headset, hinting that it could eventually replace the Quest 2.
Why does it matter?
With major players like Apple having recently launched the Vision Pro, Meta's unveiling of the Orion AR glasses and a more affordable Quest 3S headset could intensify competition. It would be interesting to see if Meta surpasses Apple in user experience and functionality.
Luma drops Dream Machine API just hours after Runway
Luma has announced the release of an API for its Dream Machine video generation model, bringing its AI video generation technology to more apps, teams, and users worldwide.
The API offers capabilities like the following (a hedged request sketch follows the list):
Text-to-Video: Users can directly generate videos via text instructions without any prompt engineering.
Image-to-Video: The feature will use natural language commands to transform static images into high-quality animations.
Video Extension and Looping: Users can extend video sequences and create endless loops.
Camera Motion Control: Users can utilize simple text inputs to direct video scenes.
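As a rough illustration of what a text-to-video request might look like, here is a hedged sketch using plain HTTP; the endpoint URL, header format, and payload fields are assumptions standing in for Luma’s actual API reference, which should be consulted before use.

```python
# Hypothetical sketch of a Dream Machine text-to-video request.
# The endpoint URL, auth header, and JSON fields are assumptions for
# illustration only -- consult Luma's official API docs for the real schema.
import os
import requests

API_KEY = os.environ["LUMA_API_KEY"]  # assumed environment variable name

payload = {
    "prompt": "A timelapse of clouds rolling over a mountain ridge at sunset",
    "loop": True,            # assumed flag for the looping capability
    "aspect_ratio": "16:9",  # assumed parameter
}

resp = requests.post(
    "https://api.lumalabs.ai/dream-machine/v1/generations",  # assumed endpoint
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # typically returns a generation id to poll for the finished video
```

A real integration would poll the returned generation until the video is ready, then download it.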
Why does it matter?
While rivals like OpenAI’s Sora offer restricted access, Luma AI’s API democratizes advanced video generation for content creators and developers, allowing them to generate video from a simple prompt. Looks like Luma’s rivals might have to up their game or be replaced.
OpenAI’s new ‘Safety Board’ can halt model releases
OpenAI’s Board has formed a Safety and Security Committee, which will make recommendations to the full Board on critical safety and security decisions for OpenAI’s projects and operations.
Its first task is to evaluate and develop OpenAI’s processes and safeguards over the next 90 days, after which the committee will share its recommendations for the Board’s review.
After the full Board’s review, OpenAI will publicly share an update on the adopted recommendations. The committee will also oversee major model releases, exercise oversight over launches, and can delay a launch over safety concerns.
Why does it matter?
OpenAI has demonstrated a solid commitment to responsible AI development amid discussions surrounding the need for ethical AI. By evaluating and enhancing existing processes and delaying model launches until safety concerns are addressed, OpenAI will likely encourage other AI companies to prioritize safe and secure AI development.
Enjoying the latest AI updates?
Refer your pals to subscribe to our newsletter and get exclusive access to 400+ game-changing AI tools.
When you use the referral link above or the “Share” button on any post, you'll get the credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.
Knowledge Nugget: Role-playing with AI will be a powerful tool for writers and educators
Historian Benjamin Breen argues that AI’s “hallucinations” can help foster historical empathy. In this article, Breen travels through time and brings AI along: in his latest experiment, he used GPT-4 to simulate and recreate a 17th-century doctor’s visit. These AI-powered historical simulations aren’t just for fun; Breen has been using them in his world history classes at UC Santa Cruz, with 84% of students reporting that the exercises enhanced their understanding of historical periods. Some key points Breen covers in the article include:
He tested AI simulations on content from his books "The Age of Intoxication" and "Tripping on Utopia."
He used AI-generated historical simulations in his world history class at UCSC.
Students reported that these simulations enhanced their understanding of historical periods.
Breen sees potential in AI for "experiential learning" in history education.
He explored using AI to simulate a 1680s physician's diagnosis and treatment.
Breen created a custom GPT model to more accurately represent 17th-century medical practices.
Why does it matter?
While AI might not be ready to replace human historians, it's proving to be a powerful tool for experiential learning. By allowing students to "interact" with the past, these simulations could revolutionize how we teach and understand history.
What Else Is Happening❗
🤖 Google DeepMind’s new robotics AI systems, Aloha Unleashed and DemoStart, demonstrated impressive dexterity, performing tasks like tying a shoelace, hanging a shirt, and cleaning a kitchen.
🛠️ Salesforce released Agentforce, a suite of low-code tools to build autonomous AI agents that can perform reasoning for sales, marketing, and commerce-related tasks.
🚀 Hume AI introduces EVI-2, a voice-to-voice model that can emulate diverse personalities, accents, speaking styles, and multiple speaking rates.
🔥 Adobe Firefly will receive video-generation features like Generative Extend, Text to Video, and Image to Video by the end of 2024.
💬 Amazon has started experimenting with ads in its Rufus chatbot. Based on Amazon search and conversational context, the ads may soon appear for users in the US.
📄 A new UK study led by a team of firefighters, engineers, and scientists reveals that AI-piloted drones may be able to prevent wildfires by spotting and putting out flames before they spread.
⚙️ SambaNova Systems has offered a high-speed, open-source alternative to OpenAI’s o1 models, boasting fast processing capabilities via a Hugging Face demo.
🚀 Microsoft launches new Copilot features like Python support in Excel, a narrative builder in PowerPoint, text summaries for Teams, and improvements in Word and OneDrive.
📝 Slack unveils an AI-powered note-taking tool capable of summarizing meetings held in Slack huddles or on Google Meet.
🔧 AI startup World Labs plans to build AI models with spatial intelligence designed to generate, perceive, and interact with 3D environments and navigate physical spaces.
New to the newsletter?
The AI Edge keeps engineering leaders & AI enthusiasts like you on the cutting edge of AI. From machine learning to ChatGPT to generative AI and large language models, we break down the latest AI developments and how you can apply them in your work.
Thanks for reading, and see you next week! 😊