Google's Gecko: LLM-Powered Text Embedding Breakthrough
Plus: Anthropic’s “many-shot jailbreaking” wears down AI ethics, CosmicMan enables the photorealistic generation of human images.
Hello Engineering Leaders and AI Enthusiasts!
Welcome to the 245th edition of The AI Edge newsletter. This edition brings you “Google's Gecko: LLM-Powered Text Embedding Breakthrough.”
And a huge shoutout to our incredible readers. We appreciate you😊
In today’s edition:
🔍 Google's Gecko: LLM-powered text embedding breakthrough
🔓 Anthropic’s “many-shot jailbreaking” wears down AI ethics
🌌 CosmicMan enables the photorealistic generation of human images
💡 Knowledge Nugget: RunLLM: The custom assistant for developer tools
Let’s go!
Google's Gecko: LLM-powered text embedding breakthrough
Gecko is a compact and highly versatile text embedding model that achieves impressive performance by leveraging the knowledge of LLMs. DeepMind researchers behind Gecko have developed a novel two-step distillation process to create a high-quality dataset called FRet using LLMs. The first step involves using an LLM to generate diverse, synthetic queries and tasks from a large web corpus. In the second step, the LLM mines positive and hard negative passages for each query, ensuring the dataset's quality.
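To make the two steps concrete, here is a minimal Python sketch of how one FRet-style training example could be assembled. It is an illustration under stated assumptions, not DeepMind's implementation: `build_example`, `toy_generate`, and `toy_score` are hypothetical names, the two toy functions stand in for real LLM calls, and using the runner-up passage as the hard negative is just one simple choice made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TrainingExample:
    task: str
    query: str
    positive: str
    hard_negative: str


def build_example(
    seed_passage: str,
    corpus: List[str],
    generate: Callable[[str], Tuple[str, str]],
    score: Callable[[str, str], float],
) -> TrainingExample:
    # Step 1: an LLM reads a passage and writes a task description and a query.
    task, query = generate(seed_passage)
    # Step 2: an LLM scores candidate passages for that query. The best one
    # becomes the positive; the runner-up is used as the hard negative
    # (an assumption made for this sketch).
    ranked = sorted(corpus, key=lambda passage: score(query, passage), reverse=True)
    return TrainingExample(task, query, positive=ranked[0], hard_negative=ranked[1])


def _words(text: str) -> List[str]:
    return text.lower().replace(".", "").split()


def toy_generate(passage: str) -> Tuple[str, str]:
    # Hypothetical stand-in for the LLM: the query is the first four words.
    return "passage retrieval", " ".join(_words(passage)[:4])


def toy_score(query: str, passage: str) -> float:
    # Hypothetical stand-in for LLM relevance scoring: count shared words.
    return float(len(set(_words(query)) & set(_words(passage))))


corpus = [
    "Geckos are small lizards found in warm climates.",
    "Geckos can climb smooth walls using tiny hairs on their toes.",
    "Text embeddings map sentences to vectors.",
]

example = build_example(corpus[1], corpus, toy_generate, toy_score)
print(example)
```

In a real pipeline, the two toy functions would be replaced by prompts to an LLM, and the candidate passages would come from a large web corpus rather than a three-item list.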
When trained on FRet combined with other academic datasets, Gecko outperforms existing models of similar size on the Massive Text Embedding Benchmark (MTEB). Remarkably, the 256-dimensional version of Gecko surpasses all models with 768 dimensions, and the 768-dimensional Gecko competes with models that are 7x larger or use embeddings with 5x higher dimensions.
Why does it matter?
Text embedding models are crucial in natural language processing tasks such as document retrieval, sentence similarity, and classification. Gecko's development shows the potential for creating a single model that can support multiple downstream tasks, eliminating the need for separate embedding models for each task. Using LLMs and knowledge distillation techniques, Gecko achieves strong retrieval performance and sets a strong baseline as a zero-shot embedding model.
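For readers new to embeddings, the sketch below shows how one set of vectors can drive a task like document retrieval: documents are ranked by cosine similarity to a query vector. The three-dimensional vectors and document names are made up for illustration and are not Gecko outputs; a real model would return hundreds of dimensions per text.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    # Cosine similarity: dot product divided by the product of vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec: List[float], doc_vecs: Dict[str, List[float]]) -> List[str]:
    # Return document names ordered from most to least similar to the query.
    return sorted(
        doc_vecs,
        key=lambda name: cosine(query_vec, doc_vecs[name]),
        reverse=True,
    )


# Toy embeddings (assumed values, not produced by any real model).
docs = {
    "gecko_paper": [0.9, 0.1, 0.0],
    "beer_study": [0.1, 0.9, 0.2],
    "xbox_news": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

ranking = rank(query, docs)
print(ranking)
```

The same similarity scores could feed a sentence-similarity check or serve as features for a classifier, which is the sense in which one embedding model can support several downstream tasks.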
Anthropic’s “many-shot jailbreaking” wears down AI ethics
Researchers at Anthropic discovered a new way to get advanced AI language models to bypass their safety restrictions and provide unethical or dangerous information. They call this the "many-shot jailbreaking" technique. By including many made-up dialog examples in the input where an AI assistant provides harmful responses, the researchers could eventually get the real AI to override its training and provide instructions on things like bomb-making.
The researchers say this vulnerability arises from AI models' increasing ability to process and "learn" from very long input sequences. Essentially, the AI mimics the unethical behavior repeatedly demonstrated in the made-up examples. Anthropic has implemented safeguards against this attack on its systems and has also shared the findings openly so other AI companies can work on mitigations.
Why does it matter?
As AI models become more capable over time, techniques to override their built-in ethical restraints pose serious risks if not addressed. While Anthropic has been transparent in disclosing this vulnerability to enable mitigations, it underscores the need for continued research into AI safety and security. Simple precautions like limiting input length are inadequate; more sophisticated AI "jailbreak" prevention methods are required as these systems advance.
CosmicMan enables the photorealistic generation of human images
Researchers at the Shanghai AI Laboratory have created a new AI model called CosmicMan that specializes in generating realistic images of people. CosmicMan can produce high-quality, photorealistic human images that precisely match detailed text descriptions, unlike current AI image models that struggle with human images.
The key to CosmicMan's success is a massive dataset called CosmicMan-HQ 1.0, which contains 6 million annotated human images and was built with a data production pipeline called “Annotate Anyone,” paired with a novel training method that focuses the model on different parts of the human body. By categorizing words in the text description into body part groups like head, arms, legs, etc., the model can generate each part separately for better accuracy and customizability, thereby outperforming the current state-of-the-art models.
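As a rough illustration of what categorizing description words into body part groups can look like, here is a toy Python sketch. It is not the CosmicMan method: the keyword table and the `group_words` function are hypothetical, and the real model works with learned representations rather than a lookup table.

```python
from typing import Dict, List

# Hypothetical keyword table, invented for this example.
BODY_PART_KEYWORDS = {
    "head": {"hair", "hat", "glasses", "face"},
    "arms": {"sleeves", "watch", "gloves"},
    "legs": {"jeans", "trousers", "skirt", "boots"},
}


def group_words(description: str) -> Dict[str, List[str]]:
    # Assign each word in the description to the body part group it mentions.
    groups: Dict[str, List[str]] = {part: [] for part in BODY_PART_KEYWORDS}
    for word in description.lower().replace(",", " ").split():
        for part, keywords in BODY_PART_KEYWORDS.items():
            if word in keywords:
                groups[part].append(word)
    return groups


groups = group_words("A woman with curly hair, rolled sleeves, and blue jeans")
print(groups)
```

Each group could then be used to steer the part of the image it describes, which is the intuition behind generating each body part with more targeted guidance.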
Why does it matter?
Existing AI models have struggled to create realistic human images and accurately represent diverse human appearances. With CosmicMan, AI systems will be better equipped to generate high-fidelity images of people, which can have implications for computer vision, graphics, entertainment, virtual reality, and fashion. It may enable more realistic virtual avatars, improved character generation in games and movies, and enhanced visual content creation.
Enjoying the daily updates?
Refer your pals to subscribe to our daily newsletter and get exclusive access to 400+ game-changing AI tools.
When you use the referral link above or the “Share” button on any post, you'll get the credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.
Knowledge Nugget: RunLLM: The custom assistant for developer tools
In this article, the authors introduce RunLLM, an AI-powered developer assistant. They built the tool over the past six months, and it can generate code, answer conceptual questions, and help with debugging. The tool combines fine-tuned LLMs with data augmentation and retrieval techniques, and it learns from documentation, guides, and the community to help developers.
Here are some key benefits of RunLLM:
Expertise without hallucination: Because the tool has built-in knowledge, it can identify which information sources are relevant to a question better than a generic assistant can. This allows the tool to confidently base its responses on reliable sources.
Efficiency via fine-tuning: Training an expert language model on a specific topic allows the tool to use smaller base models, which means the tool can generate higher-quality results faster and at a lower cost.
Tight feedback loops: Because each assistant has its own customized language model and information database, it can establish a tight feedback loop.
Simplicity and ease of use: The process of setting up a new assistant is very easy. You need to upload documentation & guides, trigger a fine-tuning job, and integrate RunLLM.
Why does it matter?
By creating custom AI models fine-tuned to a company's specific data and use cases, RunLLM is paving the way for AI assistants that provide highly accurate and tailored support across various industries. It could allow companies to leverage LLMs while mitigating issues like hallucinations and a lack of domain expertise. It could also democratize access to AI and foster innovation as companies can now build their own domain-specific AI capabilities.
What Else Is Happening❗
🎮 Microsoft is planning to add an AI chatbot to Xbox
Microsoft is currently testing a new AI-powered chatbot to be added to Xbox to automate customer support tasks. The software giant has tested an “embodied AI character” that animates when responding to Xbox support queries. The virtual representative can handle either text or voice requests. It’s an effort to integrate AI into Xbox platforms and services. (Link)
☁️ Cloudflare launches Workers AI to power one-click deployment with Hugging Face
Cloudflare has launched Workers AI, which lets developers bring their AI applications from Hugging Face to its platform in one click. The serverless GPU-powered inference platform is now generally available to the public. The Cloudflare-Hugging Face integration, announced nearly seven months ago, makes it easy to deploy models onto Workers AI. (Link)
🍺 Machine Learning can predict and enhance complex beer flavor
In a study published in Nature Communications, researchers combined chemical analyses, sensory data, and machine learning to create models that accurately predict beer flavor and consumer appreciation from the beer's chemical composition. They identified compounds that enhance flavor and used this knowledge to improve the taste and popularity of commercial beers. (Link)
📖 Read AI adds AI summaries to meetings, emails, and messages
Read AI is expanding its services from summarizing video meetings to including messages and emails. The platform connects to popular communication platforms like Gmail, Outlook, Slack, Zoom, Microsoft Teams, and Google Meet to deliver daily updates, summaries, and AI-generated takeaways. The goal is to help users save time and improve productivity. (Link)
🤖 Billie Eilish, Katy Perry, and 200 other artists protest AI’s devaluation of music
In an open letter, over 200 famous musicians, including Billie Eilish and Katy Perry, have expressed their concerns about the negative impact of AI on human creativity. They call for the responsible use of AI and urge AI companies to stop creating music that undermines their work. They believe that unregulated and uncontrolled use of AI can harm songwriters, musicians, and creators. They emphasize the need to protect artists' rights and fair compensation. (Link)
New to the newsletter?
The AI Edge keeps engineering leaders & AI enthusiasts like you on the cutting edge of AI. From ML to ChatGPT to generative AI and LLMs, we break down the latest AI developments and how you can apply them in your work.
Thanks for reading, and see you tomorrow. 😊