AI Weekly Rundown (January 13 to January 19)
Major AI announcements from Google, Apple, DeepMind, Meta, and more.
Hello Engineering Leaders and AI Enthusiasts!
Another eventful week in the AI realm. Lots of big news from huge enterprises.
In today’s edition:
🚀 Google’s new medical AI, AMIE, beats doctors
🕵️‍♀️ Anthropic researchers find AI models can be trained to deceive
🖼️ Google introduces PALP, prompt-aligned personalization
📊 91% of leaders expect productivity gains from AI: Deloitte survey
🛡️ TrustLLM measures trustworthiness in LLMs
🎨 Tencent launched a new text-to-image method
💻 Stability AI’s new coding assistant rivals Meta's Code Llama 7B
✨ Alibaba announces AI to replace video characters with 3D avatars
🔍 ArtificialAnalysis guides you in selecting the best LLM
🏅 Google DeepMind AI solves Olympiad-level math
🆕 Google introduces new ways to search in 2024
🌐 Apple's AIM is a new frontier in vision model training
🔮 Google introduces ASPIRE for selective prediction in LLMs
🏆 Meta presents Self-Rewarding Language Models
🧠 Meta is working on Llama 3 and open-source AGI
Let’s go!
Google’s new medical AI, AMIE, beats doctors
Google developed Articulate Medical Intelligence Explorer (AMIE), an LLM-based research AI system optimized for diagnostic reasoning and conversations.
AMIE's performance was compared to that of primary care physicians (PCPs) in a randomized, double-blind crossover study of text-based consultations with validated patient actors in the style of an Objective Structured Clinical Examination (OSCE). AMIE demonstrated greater diagnostic accuracy and superior performance on 28 of 32 axes according to specialist physicians and 24 of 26 axes according to patient actors.
Anthropic researchers find AI models can be trained to deceive
A recent study co-authored by researchers at Anthropic investigated whether models can be trained to deceive, for example by injecting exploits into otherwise secure computer code.
The research team hypothesized that if they took an existing text-generating model (such as GPT-4 or Claude), fine-tuned it on examples of both desired behavior (e.g., helpfully answering questions) and deception (e.g., writing malicious code), and then built “trigger” phrases into the model that encouraged it to lean into its deceptive side, they could get the model to consistently behave badly. Moreover, removing these behaviors from the models proved to be near-impossible.
Google introduces PALP, prompt-aligned personalization
Google Research introduces a novel personalization method that improves prompt alignment. It focuses on personalizing a model for a single prompt.
While this may seem restrictive, the method excels at text alignment, enabling the creation of images from complex and intricate prompts that pose a challenge for current techniques.
91% of leaders expect productivity gains from AI: Deloitte survey
Deloitte has released a new report on GenAI, highlighting concerns among business leaders about its societal impact and the availability of tech talent. The survey finds that 61% are enthusiastic, but 30% remain unsure.
56% of companies focus on efficiency, and 29% on productivity rather than innovation and growth. Technical talent was identified as the main barrier to AI adoption, followed by regulatory compliance and governance issues.
TrustLLM measures trustworthiness in LLMs
TrustLLM is a comprehensive study of trustworthiness in LLMs like ChatGPT. The paper proposes principles for trustworthy LLMs and establishes a benchmark across dimensions like truthfulness, safety, fairness, and privacy. The study evaluates 16 mainstream LLMs and finds that trustworthiness and utility are positively related.
Tencent launched a new text-to-image method
Tencent launched PhotoMaker, a personalized text-to-image generation method. It efficiently creates realistic human photos based on given text prompts.
PhotoMaker outperforms test-time fine-tuning methods in preserving identity while providing faster generation, high-quality results, strong generalization, and a wide range of applications.
Stability AI’s new coding assistant to rival Meta's Code Llama 7B
Stability AI has released Stable Code 3B, an AI model that can generate code and fill in missing sections of existing code.
It outperforms other models in completion quality and is available for commercial use through Stability AI's membership subscription service.
Alibaba announces Motionshop; AI replaces video characters with 3D avatars
Alibaba announced Motionshop, a framework that replaces characters in videos with 3D avatars. The process involves extracting the background video sequence, estimating poses, and rendering the avatar video sequence using a high-performance ray-tracing renderer.
ArtificialAnalysis guides you in selecting the best LLM
ArtificialAnalysis helps you select the best LLM for real AI use cases. It gives developers, customers, and users of AI models the data they need to choose:
Which AI model should be used for a given task?
Which hosting provider is needed to access the model?
It provides performance benchmarking and analysis of AI models and API hosting providers.
Google DeepMind AI solves Olympiad-level math
DeepMind unveiled AlphaGeometry, an AI system that solves complex geometry problems at a level approaching that of a human Olympiad gold medalist. It is a breakthrough in AI performance.
In a benchmarking test of 30 Olympiad geometry problems, AlphaGeometry solved 25 within the standard Olympiad time limit. The previous SoTA system solved 10, and the average human gold medalist solved 25.9 problems.
Google introduces new ways to search in 2024
Circle to Search: Without switching apps, with a simple gesture, you can select images, text or videos in whatever way comes naturally to you and find the information you need right where you are.
Multisearch in Lens: When you point your camera (or upload a photo or screenshot) and ask a question using the Google app, the new multisearch experience will show results with AI-powered insights that go beyond just visual matches.
Apple's AIM is a new frontier in vision model training
Apple Research introduces AIM, a collection of vision models pre-trained with an autoregressive objective. These models are inspired by their textual counterparts, i.e., LLMs, and exhibit similar scaling properties.
The researchers illustrate the practical implications by pre-training a 7-billion-parameter AIM on 2 billion images. Interestingly, even at this scale, there were no clear signs of saturation in performance.
Google AI Introduces ASPIRE
Google AI introduces ASPIRE, a framework designed to improve the selective prediction capabilities of LLMs. It enables LLMs to output an answer together with a confidence score indicating the probability that the answer is correct.
Experimental results show that ASPIRE outperforms existing selective prediction methods on various question-answering datasets.
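To make the idea of selective prediction concrete, here is a minimal Python sketch. It is not ASPIRE itself: the `Prediction` type, the `selective_predict` helper, and the 0.8 threshold are all hypothetical names and values chosen for illustration. It only shows the general pattern of answering when confidence is high and abstaining otherwise, plus the usual coverage/accuracy trade-off.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Prediction:
    """Hypothetical container: a model answer plus its confidence score."""
    answer: str
    confidence: float  # estimated probability that `answer` is correct, in [0, 1]


def selective_predict(pred: Prediction, threshold: float = 0.8) -> Optional[str]:
    """Return the answer if confidence clears the threshold; otherwise abstain (None)."""
    if not 0.0 <= pred.confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return pred.answer if pred.confidence >= threshold else None


def coverage_and_accuracy(
    preds: List[Prediction], labels: List[str], threshold: float
) -> Tuple[float, float]:
    """Coverage = share of questions answered; accuracy = share correct among those answered."""
    answered = [
        (p, y) for p, y in zip(preds, labels)
        if selective_predict(p, threshold) is not None
    ]
    coverage = len(answered) / len(preds) if preds else 0.0
    accuracy = (
        sum(p.answer == y for p, y in answered) / len(answered) if answered else 0.0
    )
    return coverage, accuracy
```

Raising the threshold typically lowers coverage and raises accuracy on the answered subset; selective prediction methods are generally compared on how well their confidence scores manage that trade-off.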
Meta’s SRLM generates HQ rewards in training
Meta researchers propose a new approach to training language models called Self-Rewarding Language Models (SRLM). In SRLM, the language model itself provides the rewards during training. The researchers demonstrate that this approach improves the model's ability both to follow instructions and to generate high-quality rewards for itself.
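As a rough illustration of what "the model rewards itself" can look like, here is a toy Python sketch. It is not Meta's implementation: `generate` and `judge` are hypothetical callables standing in for the same language model used in two roles (writing responses and scoring them), and the output is a list of preference pairs that a preference-based training step could consume.

```python
from typing import Callable, List, Tuple

# (prompt, preferred response, rejected response)
PreferencePair = Tuple[str, str, str]


def build_preference_pairs(
    prompts: List[str],
    generate: Callable[[str], str],      # hypothetical: model writes a response
    judge: Callable[[str, str], float],  # hypothetical: same model scores a response
    n_candidates: int = 4,
) -> List[PreferencePair]:
    """For each prompt, sample candidates, self-score them, keep best vs. worst."""
    pairs: List[PreferencePair] = []
    for prompt in prompts:
        candidates = [generate(prompt) for _ in range(n_candidates)]
        scored = [(judge(prompt, c), c) for c in candidates]
        best = max(scored, key=lambda s: s[0])
        worst = min(scored, key=lambda s: s[0])
        # Skip prompts where every candidate got the same score: no preference signal.
        if best[0] > worst[0]:
            pairs.append((prompt, best[1], worst[1]))
    return pairs
```

In an iterative setup, the pairs from one round would be used to update the model, and the updated model would then act as both generator and judge in the next round.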
Meta to build Open-Source AGI, Zuckerberg says
Meta is working on artificial general intelligence (AGI) and Llama 3.
The FAIR AI research group will be merged with the GenAI team to pursue the AGI vision jointly.
Meta plans to deploy 350,000 Nvidia H100 GPUs for AI training by the end of the year, bringing its total compute to the equivalent of roughly 600,000 H100s once other GPUs are counted.
Zuckerberg highlighted the importance of AI in the metaverse and the potential of Ray-Ban smart glasses.
That's all for now!
Subscribe to The AI Edge and gain exclusive access to content enjoyed by professionals from Moody’s, Vonage, Voya, WEHI, Cox, INSEAD, and other esteemed organizations.
Thanks for reading, and see you on Monday. 😊