Google Bard Is Dead, Gemini Advanced Is In!
Plus: OpenAI Is Developing AI Agents, Brilliant Labs Announces Multimodal AI Glasses.
Hello Engineering Leaders and AI Enthusiasts!
Welcome to the 207th edition of The AI Edge newsletter. This edition brings you “Google Bard Is Dead, Gemini Advanced Is In!”.
And a huge shoutout to our incredible readers. We appreciate you😊
In today’s edition:
🔥 Google Bard Is Dead, Gemini Advanced Is In!
🤖 OpenAI Is Developing AI Agents To Automate Work
👓 Brilliant Labs Announces Multimodal AI Glasses With Perplexity's AI
📚 Knowledge Nugget: Where are we going in 2024 with LLM AI?
Let’s go!
Google Bard Is Dead, Gemini Advanced Is In!
Google Bard is now Gemini
Google has rebranded its Bard conversational AI to Gemini with a new sidekick: Gemini Advanced!
This advanced chatbot is powered by Google’s largest “Ultra 1.0” language model, which testing showed to be the most preferred chatbot among its competitors.
Google launches Gemini Advanced
Google launched the Gemini Advanced chatbot with its Ultra 1.0 AI model. The Advanced version can walk you through a DIY car repair or brainstorm your next viral TikTok.
Google rolls out Gemini mobile apps
Gemini is also moving onto Android and iOS phones as a pocket pal ready to share creative fire 24/7 via voice commands, screen overlays, or camera scans. The ‘droid rollout has started in the US and some Asian countries. The rest of us will just be staring at our phones and waiting for an invite from Google.
P.S. It will gradually expand globally.
Why does this matter?
With Gemini Advanced, Google has taken the LLM race to the next level, challenging its competitor, GPT-4, with an architecture optimized for search queries and natural language understanding. Who wins the race is now only a matter of time!
OpenAI Is Developing AI Agents To Automate Work
OpenAI is developing AI "agents" that can autonomously take over a user's device and execute multi-step workflows.
One type of agent takes over a user's device and automates complex workflows between applications, like transferring data from a document to a spreadsheet for analysis. This removes the need for manual cursor movements, clicks, and typing between apps.
Another agent handles web-based tasks like booking flights or creating itineraries without needing access to APIs.
While OpenAI's ChatGPT can already do some agent-like tasks using APIs, these AI agents will be able to do more unstructured, complex work with little explicit guidance.
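For context, here is a minimal sketch of how today’s agent-like behavior is typically wired up with function calling in the OpenAI Python SDK (v1+). This is illustrative only: OpenAI hasn’t published details of the new agents, and book_flight here is a hypothetical helper, not a real API.

```python
# Sketch of today's function-calling pattern (assumes openai>=1.0).
# The reported agents would go further, chaining steps on their own.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def book_flight(origin: str, destination: str, date: str) -> str:
    """Hypothetical stand-in for a real booking integration."""
    return f"Booked {origin} -> {destination} on {date}"


# Describe the tool so the model knows when and how to call it.
tools = [{
    "type": "function",
    "function": {
        "name": "book_flight",
        "description": "Book a one-way flight for the user",
        "parameters": {
            "type": "object",
            "properties": {
                "origin": {"type": "string"},
                "destination": {"type": "string"},
                "date": {"type": "string", "description": "YYYY-MM-DD"},
            },
            "required": ["origin", "destination", "date"],
        },
    },
}]

messages = [{"role": "user",
             "content": "Book me a flight from SFO to JFK on 2024-03-01."}]

response = client.chat.completions.create(
    model="gpt-4",  # any tool-calling-capable model works here
    messages=messages,
    tools=tools,
)

# If the model chose to call our tool, run it locally with the
# arguments the model produced.
for call in response.choices[0].message.tool_calls or []:
    args = json.loads(call.function.arguments)
    print(book_flight(**args))
```

Note the difference: here a developer must define every tool up front, while the agents OpenAI is reportedly building would carry out unstructured, multi-step work without each step being wired in advance.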
Why does this matter?
Having AI agents that can independently carry out tasks like booking travel could greatly simplify digital life for many end users. Rather than manually navigating across apps and websites, users can plan an entire vacation through a conversational assistant or have household devices automatically troubleshoot problems without any user effort.
Brilliant Labs Announces Multimodal AI Glasses With Perplexity's AI
Brilliant Labs announces Frame
While Apple hogged the spotlight with its chunky new Vision Pro, a Singapore startup, Brilliant Labs, quietly showed off its AR glasses packed with a multi-modal voice/vision/text AI assistant named Noa.
These lightweight smart glasses, dubbed “Frame,” are powered by models like GPT-4 and Stable Diffusion, enabling hands-free price comparisons and visual overlays that project information before your eyes via voice commands. No fiddling with another device needed.
The best part is that programmers can build on these AI glasses thanks to their open-source design.
Perplexity to integrate its AI chatbot into Frame
Beyond enhancing daily activities and interactions with the digital and physical world, Noa will also provide rapid answers through Perplexity's real-time chatbot, so Frame's responses stay sharp.
Why does this matter?
Unlike Apple's Vision Pro and Meta’s glasses, which immerse users in augmented reality for interactive experiences, the Frame AR glasses focus on improving daily interactions and tasks, like comparing product prices while shopping, translating foreign text while traveling abroad, or creating shareable media on the go.
It also enhances accessibility for users with limited dexterity or vision.
Enjoying the daily updates?
Refer your pals to subscribe to our daily newsletter and get exclusive access to 400+ game-changing AI tools.
When you use the referral link above or the “Share” button on any post, you'll get the credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.
Knowledge Nugget: Where Are We Going In 2024 With LLM AI?
The future’s always uncertain, but the author sees some familiar LLM patterns forming for 2024. First up: small AI models will become popular, replacing giant GPT-4-sized models. With new frameworks, AI planning and reasoning capabilities could improve too.
And if that’s not enough, new AI model architectures like Mamba could shake up the status quo, standing shoulder-to-shoulder with OpenAI's giants on the leaderboards.
So, while 2023 was dynamic for AI, 2024 will likely continue the turbulent transformation, rewarding adopters who make flexible plans and savvy choices rather than chasing fixed benchmarks.
Why does this matter?
Two words: business opportunity.
The advances above mean startups can compete with giants like OpenAI using affordable, specialized models. And as models learn to handle complex tasks like planning, more industries can benefit from AI automation. But with the tech moving this fast, winners and losers will emerge quickly; finding the right niche will separate success from failure.
What Else Is Happening❗
📱 Instagram might use AI to write messages
Instagram appears to be building a ‘Write with AI’ option that would paraphrase your texts in different styles to boost creativity in conversations, similar to Google’s Magic Compose. (Link)
🎵 Stability AI releases Stable Audio AudioSparx 1.0 music model
Stability AI launched AudioSparx 1.0, a groundbreaking generative model for music and audio. It produces professional-grade stereo music with coherent structure from simple text prompts in seconds. (Link)
🌐 Midjourney opens alpha-testing of its website
Midjourney grants early web access to AI art creators who have generated over 1,000 images, easing its dependence on Discord. The alpha test signals that Midjourney is moving beyond its chat-app origins toward web and mobile apps, gradually maturing into a multi-platform AI art creation service. (Link)
💡 Altman seeks trillions to revolutionize AI chip capacity
OpenAI CEO Sam Altman is pursuing multi-trillion-dollar investments, including from the UAE government, to build specialized GPUs and chips for powering AI systems. If funded, the initiative would take OpenAI's model training capacity to new heights. (Link)
🚫 FCC bans deceptive AI voice robocalls
The FCC has prohibited robocalls that use AI to clone voices, declaring them "artificial" under existing law. The ruling aims to deter deception and ensure consumers are protected from exploitative automated calls mimicking trusted people. Violators face penalties as authorities crack down on illegal practices enabled by advancing voice-synthesis tech. (Link)
New to the newsletter?
The AI Edge keeps engineering leaders & AI enthusiasts like you on the cutting edge of AI. From ML to ChatGPT to generative AI and LLMs, we break down the latest AI developments and how you can apply them in your work.
Thanks for reading, and see you tomorrow. 😊