AI Mind-Reading With Near Perfection
Plus: ElevenLabs has exciting AI voice updates, and a French AI startup launches a ‘real-time’ AI voice assistant.
Hello Engineering Leaders and AI Enthusiasts!
Welcome to the 311th edition of The AI Edge newsletter. This edition features a new AI system that decodes brain activity with near perfection.
And a huge shoutout to our amazing readers. We appreciate you😊
In today’s edition:
🎯 New AI system decodes brain activity with near perfection
⚡ ElevenLabs has exciting AI voice updates
🤖 A French AI startup launches ‘real-time’ AI voice assistant
📚 Knowledge Nugget: Perplexity’s $3 Billion Valuation: What it Means for AI Search’s Future
Let’s go!
New AI system decodes brain activity with near perfection
Researchers have developed an AI system that can create remarkably accurate reconstructions of what someone is looking at based on recordings of their brain activity.
In previous studies, the team recorded brain activity using a functional MRI (fMRI) scanner and implanted electrode arrays. Now, they have reanalyzed the data from those studies with an improved AI system that learns which parts of the brain it should pay the most attention to.
As a result, some of the reconstructed images were remarkably close to the images the macaque monkey in the study actually saw.
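For intuition only, here is a conceptual sketch (in PyTorch) of the general idea of letting a model learn which recording channels to weight most heavily; the module, dimensions, and architecture below are illustrative assumptions, not the researchers' actual system.

```python
# Conceptual sketch only (not the study's architecture): a decoder that
# learns per-channel attention weights over brain-signal features, so
# informative recording sites contribute more to the reconstruction.
import torch
import torch.nn as nn

class AttentiveBrainDecoder(nn.Module):
    def __init__(self, n_channels=512, feat_dim=64, latent_dim=256):
        super().__init__()
        # One learnable relevance score per recording channel (electrode / brain region).
        self.channel_scores = nn.Parameter(torch.zeros(n_channels))
        self.to_latent = nn.Sequential(
            nn.Linear(feat_dim, latent_dim),
            nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )

    def forward(self, x):                                  # x: (batch, channels, feat_dim)
        attn = torch.softmax(self.channel_scores, dim=0)   # weights sum to 1 across channels
        pooled = (x * attn[None, :, None]).sum(dim=1)      # attention-weighted pooling
        return self.to_latent(pooled)                      # latent for an image generator

decoder = AttentiveBrainDecoder()
signals = torch.randn(4, 512, 64)                          # toy batch of brain recordings
print(decoder(signals).shape)                              # torch.Size([4, 256])
```

In a setup like this, the learned softmax weights stand in for "which parts of the brain matter most," and the pooled latent would then be fed to an image generator to produce the reconstruction.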
Why does it matter?
This is probably the closest, most accurate AI mind-reading achieved yet. It shows that reconstructed images improve dramatically when the AI learns which parts of the brain to pay attention to. Ultimately, this could lead to better brain implants for restoring vision.
ElevenLabs has exciting AI voice updates
ElevenLabs has partnered with the estates of iconic Hollywood stars to bring their voices to its Reader App. Judy Garland, James Dean, Burt Reynolds, and Sir Laurence Olivier are now part of the app's library of voices.
(Source)
It has also introduced Voice Isolator, a tool that removes unwanted background noise and extracts crystal-clear dialogue from any audio, making your next podcast, interview, or film sound like it was recorded in a studio. It will be available via API in the coming weeks. (Source)
Why does it matter?
ElevenLabs is shipping fast! It appears to be setting the standard in AI voice technology, consistently introducing new capabilities and addressing a wide range of needs across the audio industry.
French AI startup launches ‘real-time’ AI voice assistant
A French AI startup, Kyutai, has launched Moshi, a new ‘real-time’ AI voice assistant. It can listen and speak simultaneously and express 70 different emotions and speaking styles, ranging from whispers to accented speech.
Kyutai claims Moshi is the first real-time voice AI assistant, with a latency of 160 ms. You can try it via Hugging Face, and it will be open-sourced for research in the coming weeks.
Why does it matter?
Yet another impressive competitor that challenges OpenAI's perceived dominance in AI. (Moshi could outpace OpenAI's delayed voice offering.) Such advancements push competitors to improve their offerings, raising the bar for the entire industry.
Enjoying the daily updates?
Refer your pals to subscribe to our daily newsletter and get exclusive access to 400+ game-changing AI tools.
When you use the referral link above or the “Share” button on any post, you'll get credit for any new subscribers. All you need to do is share the link via text, email, or social media with friends.
Knowledge Nugget: Perplexity’s $3 Billion Valuation: What it Means for AI Search’s Future
This article discusses Perplexity's rapid growth and the potential of AI search as a major application for LLMs. It also explores Perplexity's strategy, including its approach to model development, competition with major players like Google and OpenAI, new features like Perplexity "Pages", and its focus on complex information queries.
The piece provides an overview of the current AI startup landscape, discussing valuations and revenues of various companies. It concludes by considering the potential for future AI-native applications and the factors that might influence their development, such as model capabilities, cost, and speed.
Why does it matter?
AI startups challenging tech giants could reshape how we access and process information. Such insights are crucial for understanding the future direction of search technology, AI development, and the potential shifts in the tech industry's power dynamics.
What Else Is Happening❗
🌐Meta’s multi-token prediction models are now open for research
In April, Meta proposed a new approach for training LLMs to forecast multiple future words simultaneously, versus the traditional method of predicting just the next word in a sequence. Meta has now released pre-trained models that leverage this approach. (Link)
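For readers curious what "predicting multiple future words simultaneously" can look like, here is a minimal sketch of the general technique: a shared trunk with several output heads, where head i is trained to predict the token i+1 steps ahead and the per-head losses are summed. The model, dimensions, and head design are toy assumptions for illustration, not Meta's released code.

```python
# Minimal multi-token prediction sketch (toy setup, not Meta's implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTokenPredictor(nn.Module):
    def __init__(self, vocab_size=1000, d_model=128, k_ahead=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.trunk = nn.TransformerEncoder(layer, num_layers=2)
        # One output head per future offset: head i predicts the token (i+1) steps ahead.
        self.heads = nn.ModuleList([nn.Linear(d_model, vocab_size) for _ in range(k_ahead)])

    def forward(self, tokens):                                    # tokens: (B, T)
        causal = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        h = self.trunk(self.embed(tokens), mask=causal)           # (B, T, d_model)
        return [head(h) for head in self.heads]                   # k logit tensors

def multi_token_loss(logits_per_head, tokens):
    """Sum of cross-entropy losses, one per future offset."""
    loss = 0.0
    for i, logits in enumerate(logits_per_head):
        shift = i + 1                                             # predict token at t + shift
        pred = logits[:, :-shift, :]                              # drop positions with no target
        target = tokens[:, shift:]
        loss = loss + F.cross_entropy(pred.reshape(-1, pred.size(-1)), target.reshape(-1))
    return loss

tokens = torch.randint(0, 1000, (2, 32))                          # toy batch of token ids
model = MultiTokenPredictor()
print(multi_token_loss(model(tokens), tokens))
```

At inference time, such a model can fall back to the first head for ordinary next-token generation; the extra heads can, in principle, also be used to speed up decoding via self-speculative schemes.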
🤝Apple to announce AI partnership with Google at iPhone 16 event
Apple has been meeting with several potential AI partners, including Google. Reportedly, Apple will announce the addition of Google Gemini on iPhones at its annual event in September. (Link)
📢Google simplifies the process for advertisers to disclose if political ads use AI
In an update to its Political content policy, Google now requires advertisers to disclose when election ads contain synthetic or digitally altered content. It will automatically include an in-ad disclosure for specific formats. (Link)
🧍♂️WhatsApp is developing a personalized AI avatar generator
WhatsApp appears to be working on a new generative AI feature that will let users create personalized avatars of themselves for use in any imagined setting. It will generate the images using user-supplied photos, text prompts, and Meta’s Llama model. (Link)
🛡️Meta ordered to stop training its AI on Brazilian personal data
Brazil's National Data Protection Authority (ANPD) has suspended, with immediate effect, Meta's new privacy policy (updated in May) covering the use of personal data to train generative AI systems in the country. Meta will face daily fines if it fails to comply. (Link)
New to the newsletter?
The AI Edge keeps engineering leaders & AI enthusiasts like you on the cutting edge of AI. From machine learning to ChatGPT to generative AI and large language models, we break down the latest AI developments and how you can apply them in your work.
Thanks for reading, and see you tomorrow. 😊
Overall, Moshi's advantages are clear: compared with other voice dialogue bots, it is more human-like, with strong immediacy, quick responses, and rich expressiveness.
However, unlike GPT-4o, Moshi lacks the ability to handle multiple languages.
Currently, Moshi's core generation capabilities are not as strong as Llama 3 8B's, but it could potentially be paired with RAG or fine-tuned for specific tasks.
In summary, Moshi has shown me the true potential of natural communication between AI and humans.
Supporting more voices and languages may just be a matter of time, and its potential as a coach, a companion, or a partner in various role-playing applications is very exciting.