AI Weekly Rundown (May 4 to May 10)
Major AI announcements from Apple, OpenAI, Microsoft, DeepMind, and more.
Hello Engineering Leaders and AI Enthusiasts!
Another eventful week in the AI realm. Lots of big news from huge enterprises.
In today’s edition:
🤖 DrEureka can automate robot training using LLMs
🚀 Free AI model rivals GPT-4 in language model evaluation
📰 X introduces Stories feature powered by Grok AI
🤖 Apple is developing its own AI chip for data center servers
🤝 Stack Overflow and OpenAI have announced an API partnership
🌟 Microsoft is developing a new AI language model
🖼️ OpenAI’s new tool detects 98% of DALL-E 3-generated images
📣 Meta expands AI-powered creativity tools for advertisers
🎬 OpenAI’s ‘Media Manager’ will let creators opt out of AI training
🕵️‍♀️ Microsoft developed a secretive AI service for US spies
🧬 Google DeepMind and Isomorphic Labs introduce AlphaFold 3
🧠 OpenAI’s Model Spec shares how it teaches its models to behave
🔍 Microsoft-LinkedIn study reveals rapid AI adoption in workplace & hiring
💬 Stability AI launches Stable Artisan, a Discord bot for image & video
🎵 ElevenLabs develops an AI model to generate song lyrics
Let’s go!
DrEureka can automate robot training using LLMs
In robotics, one of the biggest challenges is transferring skills learned in simulation to real-world environments. NVIDIA researchers have developed a groundbreaking algorithm called DrEureka that uses LLMs to automate the design of reward functions and domain randomization parameters—key components in the sim-to-real transfer process.
The algorithm works in three stages: first, it creates reward functions with built-in safety instructions; then, it runs simulations to determine the best range of physics parameters; finally, it generates domain randomization configurations based on the data gathered in the previous stages.
When tested on various robots, including quadrupeds and dexterous manipulators, DrEureka-trained policies outperformed those designed by human experts.
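For the hands-on readers, here is a rough Python sketch of the last two stages described above: finding the range of physics parameters a policy can handle, then turning those ranges into a domain randomization config. Everything in it is an assumption made for illustration: the parameter names, the numbers, and the function names are hypothetical, and the step where DrEureka asks an LLM to write the config is replaced by simply reusing the feasible range.

```python
import random

# Hypothetical stage-2 output: for each physics parameter, the values tried in
# simulation and whether the policy still succeeded. All numbers are made up.
sweep_results = {
    "friction": [(0.1, False), (0.3, True), (0.8, True), (1.5, True), (3.0, False)],
    "gravity": [(-12.0, False), (-10.5, True), (-9.8, True), (-9.0, True), (-7.0, False)],
}


def feasible_range(trials):
    """Return (low, high) spanning the values where the policy succeeded."""
    ok = [value for value, succeeded in trials if succeeded]
    if not ok:
        raise ValueError("no successful trial for this parameter")
    return min(ok), max(ok)


def build_dr_config(results):
    """Stand-in for stage 3: reuse each feasible range as the sampling range.

    In the real system an LLM proposes the configuration; this sketch skips
    that call entirely.
    """
    return {name: feasible_range(trials) for name, trials in results.items()}


def sample_episode_params(config, rng):
    """Draw one set of physics parameters for a single training episode."""
    return {name: rng.uniform(low, high) for name, (low, high) in config.items()}


config = build_dr_config(sweep_results)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
params = sample_episode_params(config, rng)
print(config)
print(params)
```

Each training episode would draw a fresh set of parameters from these ranges, so the policy sees many slightly different versions of the simulated world before it meets the real one.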
Free AI model rivals GPT-4 in language model evaluation
Prometheus 2, a free and open-source language model developed by KAIST AI, has shown impressive capabilities in evaluating other language models, approaching the performance of commercial models like GPT-4.
The model was trained on a new pairwise comparison dataset called the "Preference Collection," which includes over 1,000 evaluation criteria beyond basic characteristics. By combining two separate models - one for direct ratings and another for pairwise comparisons - the researchers achieved the best results.
In tests across eight datasets, Prometheus 2 showed the highest agreement with human judgments and commercial language models among all freely available rating models, significantly closing the gap with proprietary models.
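Curious what "combining two models" can look like in practice? One common approach for models that share an architecture is linear weight merging, where matching parameters are averaged. The Python sketch below shows the idea on toy data. It is an assumption that this mirrors the researchers' recipe; the layer names, values, and the `alpha` mixing weight are hypothetical.

```python
def merge_weights(direct, pairwise, alpha=0.5):
    """Linearly interpolate two models' parameters, layer by layer.

    `direct` and `pairwise` map layer names to lists of floats. `alpha` is the
    share taken from the direct-rating model (hypothetical default of 0.5).
    """
    if direct.keys() != pairwise.keys():
        raise ValueError("models must have identical layer names")
    merged = {}
    for name in direct:
        a, b = direct[name], pairwise[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch in layer {name!r}")
        merged[name] = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
    return merged


# Toy "models": real checkpoints hold billions of values, not three.
direct_rating_model = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
pairwise_ranking_model = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = merge_weights(direct_rating_model, pairwise_ranking_model)
print(merged)
```

The appeal of merging is that the result is a single model that can serve both evaluation formats, rather than two models that have to be run side by side.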
X introduces Stories feature powered by Grok AI
X (formerly Twitter) has launched a new feature, Stories, that provides AI-generated summaries of trending news on the platform. Powered by Elon Musk's chatbot Grok, Stories offers Premium subscribers brief overviews of the most popular posts and conversations happening on X.
With Stories, users can quickly catch up on the day's trending topics without having to scroll through countless posts. Grok generates these summaries based solely on the conversations happening on X about each news story rather than analyzing the original news articles themselves. While this approach is controversial, X believes it will pique users' curiosity and potentially drive them deeper into the source material.
Apple is developing its own AI chip for data center servers
Apple is developing its own AI chip for data center servers, known internally as Project ACDC (Apple Chips in Data Center). The chip will likely focus on running AI models (inference) rather than training them, which is where Nvidia currently dominates.
The company is working closely with TSMC (Taiwan Semiconductor Manufacturing Co) to design and produce these chips, although the timeline for launch is uncertain. With this move, the company aims to keep up with rivals like Microsoft and Meta, who have made significant investments in generative AI.
Stack Overflow and OpenAI have announced an API partnership
OpenAI will use OverflowAPI to improve model performance and provide attribution to the Stack Overflow community within ChatGPT. Stack Overflow will use OpenAI models to develop OverflowAI and to maximize model performance.
The partnership aims to improve the user and developer experience on both platforms. The first set of integrations and capabilities will be available in the first half of 2024, and the partnership will enable Stack Overflow to reinvest in community-driven features.
Microsoft is developing a new AI language model
Microsoft is developing a new, large-scale AI language model called MAI-1 to compete with Google and OpenAI. The model is overseen by Mustafa Suleyman, the recently hired co-founder of Google DeepMind.
With roughly 500 billion parameters, MAI-1 will be larger and more expensive than Microsoft's previous smaller, open-source models. Microsoft could preview the new model as soon as its Build developer conference later this month.
OpenAI’s new tool detects 98% of DALL-E 3-generated images
OpenAI has developed a new tool to detect if an image was created by DALL-E 3, its AI image generator. The tool can detect DALL-E 3 images with around 98% accuracy, even if the image has been cropped, compressed, or had its saturation changed. However, the tool is far less effective on images generated by other AI models, flagging only 5-10% of them.
This image detection classifier is only available to a group of testers, including research labs and research-oriented journalism nonprofits, through OpenAI’s Research Access Program.
OpenAI has also added watermarking to Voice Engine, its text-to-speech platform, which is currently in limited research preview.
Meta expands AI-powered creativity tools for advertisers
Meta has expanded its generative AI tools for advertisers. Advertisers can ask the AI to generate entirely new images, including product variations in different colors, angles, and scenarios. The AI tools can add text overlays with different fonts, expand images to fit the aspect ratios of placements like Reels and Feed, and generate ad headlines that match the brand's voice.
The AI features will roll out globally to advertisers by the end of 2024.
Meta is also expanding its paid Meta Verified service for businesses to more countries. Different pricing tiers offer features like account support, profile enhancements, and better customer service access.
OpenAI’s ‘Media Manager’ will let creators opt out of AI training
OpenAI is developing Media Manager, a tool that will enable creators and content owners to tell OpenAI what they own and specify how they want their works to be included or excluded from machine learning research and training. This first-ever tool of its kind will help OpenAI identify copyrighted text, images, audio, and video across multiple sources and reflect creator preferences.
OpenAI aims to have the tool in place by 2025 and set a standard across the AI industry with it.
Microsoft developed a secretive AI service for US spies
Microsoft has developed a top-secret generative AI model entirely disconnected from the internet so US intelligence agencies can safely harness the powerful technology to analyze top-secret info. The GPT-4-based model is now live; it answers questions and can also write code.
Microsoft spent 18 months developing the model, which is "air-gapped" to ensure it is secure. This is the first time a model has been fully isolated, meaning it is not connected to the internet but runs on a special network that is only accessible by the U.S. government.
It can read and analyze files but cannot learn from them, which stops sensitive information from being absorbed into the platform. It is yet to be tested and accredited by the intelligence agencies.
Google DeepMind and Isomorphic Labs introduce AlphaFold 3
AlphaFold 3 is a revolutionary model that can predict the structure and interactions of all life’s molecules with unprecedented accuracy.
For the interactions of proteins with other molecule types, it sees at least a 50% improvement compared with existing prediction methods, and for some important categories of interaction it has doubled prediction accuracy. AlphaFold 3’s capabilities come from its next-generation architecture and training that now covers all of life’s molecules.
Google DeepMind has also newly launched AlphaFold Server. It is a free platform that scientists worldwide can use for non-commercial research. With just a few clicks, biologists can harness the power of AlphaFold 3 to model structures composed of proteins, DNA, RNA and a selection of ligands, ions and chemical modifications.
OpenAI’s Model Spec shares how it teaches its models to behave
OpenAI has shared the first draft of Model Spec, a new document that specifies its approach to shaping desired model behavior and how it evaluates tradeoffs when conflicts arise. It brings together documentation used at OpenAI today, the company's experience and ongoing research in designing model behavior, and more recent work, including inputs from domain experts, that guides the development of future models.
OpenAI intends to use the Model Spec as guidelines for researchers and AI trainers who work on reinforcement learning from human feedback (RLHF). It will also explore to what degree its models can learn directly from the Model Spec.
AI demand soars in the workplace
Microsoft and LinkedIn have published their ‘2024 Work Trend Index Annual Report’, revealing the rapid adoption of AI tools by employees, with 75% of knowledge workers using AI and nearly half starting within the last six months.
Here are the key points:
78% of AI users are bringing their own AI tools to work, with 52% reluctant to admit using AI for their most important tasks.
66% of leaders say they wouldn't hire someone without AI skills, and 71% prefer less experienced candidates with AI skills over more experienced ones without.
Power users who use AI extensively are reaping benefits in productivity, creativity, and job satisfaction.
Skills are projected to change by 68% by 2030, accelerated by generative AI.
Stability AI introduces AI bot for Discord users
Stability AI has launched Stable Artisan, a multimodal generative AI Discord bot that enables users to create images and videos using the Stable Diffusion 3 (SD3) and Stable Video Diffusion (SVD) models.
Stable Artisan incorporates several editing and customization features, including Search and Replace, Remove Background, Creative Upscale, Outpaint, Control Sketch, and Control Structure. The service is available through a paid subscription, with monthly plans ranging from $9 to $99, and a 3-day free trial.
Stability AI is also working on a larger conversational chatbot called Stable Assistant, which will incorporate the company's text-to-image and LLM technologies to assist users with various tasks through natural language conversations. While Stable Artisan currently does not include access to Stable Audio, Stable Code, or Stable LM, these features may be added in the future as the service continues to evolve.
ElevenLabs debuts AI model for generating lyrics
ElevenLabs, a company that specializes in AI-powered voice cloning and synthesis, has revealed a new model that creates song lyrics based on user prompts.
With this new model, ElevenLabs aims to impact the music industry by allowing users to generate custom lullabies, jingles, podcast intros, and potentially even popular songs. The company also plans to launch a marketplace where users can sell their AI-generated music.
While ElevenLabs has not yet shared details about the maximum length of songs the AI can generate, an example posted by the company's Head of Design suggests that it will likely produce lyrics for a standard three-minute song.
That's all for now!
Subscribe to The AI Edge and gain exclusive access to content enjoyed by professionals from Moody’s, Vonage, Voya, WEHI, Cox, INSEAD, and other esteemed organizations.
Thanks for reading, and see you on Monday. 😊