Google AI's <1s Image Creation on Phones
Plus: Hugging Face simplifies custom chatbot creation, Google to release ‘Gemini Advanced’ next week.
Hello Engineering Leaders and AI Enthusiasts!
Welcome to the 203rd edition of The AI Edge newsletter. This edition brings you “Google AI's <1s Image Creation on Phones.”
And a huge shoutout to our incredible readers. We appreciate you😊
In today’s edition:
📱 Google MobileDiffusion: AI Image generation in <1s on phones
🤖 Hugging Face enables custom chatbot creation in 2-clicks
🚀 Google to release ChatGPT Plus competitor 'Gemini Advanced' next week
💡 Knowledge Nugget: Lies, damned lies, and benchmarks
Let’s go!
Google MobileDiffusion: AI Image generation in <1s on phones
Google Research introduced MobileDiffusion, which can generate 512×512-pixel images on Android and iPhone devices in about half a second. What’s impressive is its comparatively small size of just 520M parameters, which makes it uniquely suited for mobile deployment. That is significantly smaller than Stable Diffusion and SDXL, which have a billion or more parameters.
This makes it possible to offer a rapid image generation experience that updates as the user types a text prompt.
Google researchers measured the performance of MobileDiffusion on both iOS and Android devices using different runtime optimizers.
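MobileDiffusion itself has not been released publicly, but for a feel of how this kind of latency is measured, here is a minimal sketch using the open-source diffusers library with a small distilled Stable Diffusion checkpoint as a stand-in. The checkpoint, step count, and GPU target are illustrative assumptions, not part of Google’s work.

```python
# Minimal latency sketch for few-step, small-footprint text-to-image generation.
# MobileDiffusion is not publicly available, so a small distilled checkpoint
# ("segmind/tiny-sd") stands in; swap in any compact model you have access to.
import time

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "segmind/tiny-sd",          # assumed stand-in checkpoint, not MobileDiffusion
    torch_dtype=torch.float16,
).to("cuda")                    # assumes a CUDA GPU rather than a phone

start = time.perf_counter()
image = pipe(
    "a watercolor fox in a misty forest",
    num_inference_steps=8,      # few-step sampling is what makes sub-second latency plausible
    height=512,
    width=512,
).images[0]
print(f"512x512 image generated in {time.perf_counter() - start:.2f}s")
image.save("fox.png")
```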
Why does this matter?
MobileDiffusion represents a notable shift for AI image generation, especially on smartphones. Models like Stable Diffusion and DALL·E have billions of parameters and require powerful desktops or servers, making them impractical to run on a handset. With superior efficiency in both latency and size, MobileDiffusion is a strong candidate for mobile deployment.
Hugging Face enables custom chatbot creation in 2-clicks
Hugging Face tech lead Philipp Schmid announced that users can now create custom chatbots in “two clicks” using “Hugging Chat Assistant,” and the resulting assistants are publicly available. Schmid compares the feature to OpenAI’s GPTs and adds that assistants can use “any available open LLM, like Llama2 or Mixtral.”
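The Assistant feature itself is a no-code, two-click flow in the Hugging Chat UI, but a rough programmatic analogue is pinning a system persona onto an open model via the huggingface_hub Inference API. The model ID, persona, and prompt template below are illustrative assumptions, not part of Hugging Chat Assistant.

```python
# Rough programmatic analogue of a custom assistant: a fixed persona prepended
# to user questions, answered by an open LLM via the Hugging Face Inference API.
# Model ID, persona, and chat template are assumptions for illustration only.
from huggingface_hub import InferenceClient

client = InferenceClient(model="mistralai/Mixtral-8x7B-Instruct-v0.1")

persona = "You are a patient Python tutor who answers with short, runnable examples."
question = "How do I reverse a list in place?"

# Mixtral-Instruct expects the [INST] ... [/INST] chat format.
prompt = f"<s>[INST] {persona}\n\n{question} [/INST]"

reply = client.text_generation(prompt, max_new_tokens=200, temperature=0.7)
print(reply)
```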
Why does this matter?
Hugging Face’s Chat Assistant democratizes AI creation and simplifies the process of building custom chatbots, lowering the barrier to entry. And because it is built on open-source models, a broader range of individuals and organizations can harness the power of conversational AI, which should drive more innovation.
Google to release ChatGPT Plus competitor 'Gemini Advanced' next week
According to a leaked web text, Google might release its ChatGPT Plus competitor, named "Gemini Advanced," on February 7. This suggests a name change for the Bard chatbot after Google announced "Bard Advanced" at the end of last year. The Gemini Advanced chatbot will be powered by Ultra 1.0, the top tier of the eponymous Gemini model family.
According to Google, Gemini Advanced is far more capable at complex tasks like coding, logical reasoning, following nuanced instructions, and creative collaboration. Google also plans to add multimodal capabilities, coding features, and detailed data analysis. Currently, the model is optimized for English, with support for other languages expected soon.
Why does this matter?
Google’s Gemini Advanced will be its answer to OpenAI’s ChatGPT Plus. It signals increasing competition in the AI language model market, potentially leading to improved features and services for users. The open question is whether Ultra can beat GPT-4; if it can, it will be interesting to see how OpenAI counters.
Enjoying the daily updates?
Refer your pals to subscribe to our daily newsletter and get exclusive access to 400+ game-changing AI tools.
When you use the referral link above or the “Share” button on any post, you'll get the credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.
Knowledge Nugget: Lies, damned lies, and benchmarks
In this article, the author delves into the complexities and potential pitfalls of relying solely on benchmarks to evaluate the performance of AI models. Here are several reasons why LLM benchmarks can be misleading:
A narrow scope limits the applicability of results.
Overfitting distorts model performance, yielding inflated scores.
Contamination from biased or leaked datasets skews outcomes (a rough check is sketched below).
Reproducibility issues hinder the validation of reported findings.
The lack of multi-modal capabilities overlooks crucial aspects of language understanding.
The article highlights the inherent biases and limitations present in benchmark datasets and evaluation metrics, which can lead to misleading conclusions about the capabilities of AI systems. The author also briefly discusses what we should be using instead to measure our models.
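To make the contamination point above concrete, here is a minimal sketch of one common sanity check: flagging benchmark items whose word n-grams already appear in a training corpus. The toy data, n-gram length, and "any overlap counts" rule are illustrative assumptions, not a method from the article.

```python
# Toy contamination check: does any word n-gram from a benchmark item
# already appear in the training corpus? Data and n-gram length are
# illustrative assumptions, not taken from the article.
def ngrams(text: str, n: int = 8) -> set:
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def contamination_rate(benchmark: list, training_corpus: str, n: int = 8) -> float:
    """Fraction of benchmark items sharing at least one n-gram with the training text."""
    train_grams = ngrams(training_corpus, n)
    hits = sum(1 for item in benchmark if ngrams(item, n) & train_grams)
    return hits / len(benchmark) if benchmark else 0.0

corpus = "the quick brown fox jumps over the lazy dog near the river bank"
bench = [
    "the quick brown fox jumps over the lazy dog today",  # shares an 8-gram with the corpus
    "an entirely different question about matrix multiplication",
]
print(contamination_rate(bench, corpus))  # 0.5
```

Real contamination audits are far more involved (normalization, deduplication, fuzzy matching), but even a crude overlap check like this can flag suspiciously inflated scores.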
Why does this matter?
Benchmarks simplify LLM evaluations but may overlook nuanced language understanding. They focus on specific tasks, neglecting broader comprehension and real-world applicability, leading to inflated expectations and misrepresentations. Recognizing these shortcomings is key for a more robust assessment of LLMs as the horizon of generative AI expands.
What Else Is Happening❗
👶 NYU’s latest AI innovation echoes a toddler's language learning journey
New York University (NYU) researchers have developed an AI system that learns language much the way a toddler does. The model was trained on video recorded from a child’s perspective to learn words and their meanings, respond to new situations, and learn from new experiences. (Link)
😱 GenAI to disrupt 200K U.S. entertainment industry jobs by 2026
CVL Economics surveyed 300 executives from six U.S. entertainment industries between Nov 17 and Dec 22, 2023, to understand the impact of generative AI. The survey found that 203,800 jobs could be disrupted in the entertainment space by 2026. 72% of the companies surveyed are early adopters, with 25% already using GenAI and another 47% planning to implement it soon. (Link)
🍎 Apple CEO Tim Cook hints at major AI announcement ‘later this year’
Apple CEO Tim Cook hinted at a major AI announcement later this year during the company’s first-quarter earnings call with analysts. He added that there’s a massive opportunity for Apple in generative AI as it looks to compete with cutting-edge AI companies like Microsoft, Google, Amazon, and OpenAI. (Link)
👮♂️ U.S. police departments turn to AI to review bodycam footage
Over the last decade, U.S. police departments have spent millions of dollars to equip their officers with body-worn cameras that record their daily work. However, the footage collected is rarely analyzed thoroughly enough to identify patterns. Now, departments are turning to AI to examine this stockpile of footage and flag problematic officers and patterns of behavior. (Link)
🎨 Adobe to provide support for Firefly in the latest Vision Pro release
Adobe has announced that Firefly, its popular image-generating software, is coming to the new version of Apple Vision Pro, joining the company’s previously announced Lightroom photo app. Lightroom was expected to be a native Vision Pro app from launch; now Firefly, the GenAI tool that produces images from text descriptions, is being added as well. (Link)
New to the newsletter?
The AI Edge keeps engineering leaders & AI enthusiasts like you on the cutting edge of AI. From ML to ChatGPT to generative AI and LLMs, we break down the latest AI developments and how you can apply them in your work.
Thanks for reading, and see you tomorrow. 😊