AI Weekly Rundown (May 18 to May 24)

Major AI announcements from OpenAI, Microsoft, Anthropic, Cohere, and more.

May 25, 2024

Hello Engineering Leaders and AI Enthusiasts!

Another eventful week in the AI realm. Lots of big news from huge enterprises.

In today’s edition:

🤖 OpenAI's "superalignment team," focused on AI risks, is no more
🚫 Sony Music warns over 700 AI companies not to steal its content
🦎 Meta's Chameleon AI sets a new bar in mixed-modal reasoning
💻 Microsoft's New AI PCs Rival Apple's MacBooks
⚖️ Scarlett Johansson hires lawyers over OpenAI's ChatGPT voice
🧠 DINO 1.5 is smarter and faster at object detection
🚀 Microsoft's first SoTA SLM to be shipped with Windows
📈 Google unveils new AI tools for branding and product marketing
🎨 Adobe introduces Firefly AI-powered Generative Remove to Lightroom
🔍 Anthropic uncovers how Claude Sonnet's AI model works
📞 Truecaller's AI assistant gets a voice upgrade, thanks to Microsoft
🎥 TikTok makes ad creation easy with AI!
🌍 Cohere releases multilingual AI model, Aya 23
📱 Arc introduces "Call Arc" for quick voice answers
💼 Elon Musk envisions AI era, new work norms, life on Mars

Let’s go!

OpenAI's "superalignment team," focused on AI risks, is no more

The team's co-leads, Ilya Sutskever and Jan Leike, have resigned from OpenAI. Several other researchers from the team and those working on AI policy and governance have also left the company. Leike cited disagreements with OpenAI's leadership about the company's priorities and resource allocation as reasons for his departure.

The team's work will be absorbed into OpenAI's other research efforts, with John Schulman leading research on risks associated with more powerful models.

Sony Music warns over 700 AI companies not to steal its content

Sony Music, home to superstars like Billy Joel and Doja Cat, sent letters to over 700 AI companies and streaming platforms, warning them against using its content without permission. The label called out the "training, development, or commercialization of AI systems" that use copyrighted material, including music, art, and lyrics.

Sony Music Group (SMG) recognizes AI's potential but stresses the need to respect songwriters' and artists' rights. The letter asks companies to confirm they haven't used SMG content without permission or to provide details if they have.

Meta's Chameleon AI sets a new bar in mixed-modal reasoning

Meta AI introduces Chameleon, a family of early-fusion, token-based mixed-modal models that understand and generate images and text in any order. Unlike recent foundation models that process text and images separately, Chameleon's unified token space lets it process interleaved image and text sequences, enabling seamless reasoning and generation across modalities.

Meta researchers introduced architectural enhancements and training techniques to tackle the optimization challenges posed by this early fusion approach, including a novel image tokenizer, QK-Norm, dropout, and z-loss regularization. Remarkably, Chameleon achieves competitive or superior performance across various tasks, outperforming larger models like Flamingo-80B and IDEFICS-80B in image captioning and visual question answering despite its smaller model size.
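
To make the "unified token space" idea concrete, here is a minimal sketch of how text tokens and discrete image codes could share one id range and be interleaved into a single sequence. The vocabulary sizes, the offset scheme, and every function name below are assumptions made for illustration; this is not Chameleon's actual tokenizer.

```python
# Toy illustration only: sizes and ids are invented, not Chameleon's real values.
TEXT_VOCAB_SIZE = 1000      # hypothetical number of text tokens
IMAGE_CODEBOOK_SIZE = 512   # hypothetical number of discrete image codes


def text_token(token_id: int) -> int:
    """Text tokens keep their ids in [0, TEXT_VOCAB_SIZE)."""
    if not 0 <= token_id < TEXT_VOCAB_SIZE:
        raise ValueError("text id out of range")
    return token_id


def image_token(code: int) -> int:
    """Image codes are shifted so they sit right after the text ids."""
    if not 0 <= code < IMAGE_CODEBOOK_SIZE:
        raise ValueError("image code out of range")
    return TEXT_VOCAB_SIZE + code


def modality(unified_id: int) -> str:
    """Recover which modality a unified id came from."""
    return "text" if unified_id < TEXT_VOCAB_SIZE else "image"


def interleave(segments):
    """Flatten ("text", ids) / ("image", codes) segments into one sequence."""
    sequence = []
    for kind, ids in segments:
        mapper = text_token if kind == "text" else image_token
        sequence.extend(mapper(i) for i in ids)
    return sequence


# A caption, an image, then more text, all in one stream.
document = [("text", [5, 17]), ("image", [3, 400]), ("text", [9])]
sequence = interleave(document)
modalities = [modality(t) for t in sequence]
print(sequence)    # [5, 17, 1003, 1400, 9]
print(modalities)  # ['text', 'text', 'image', 'image', 'text']
```

Because every element is just an integer id, a single model could in principle read or emit text and image tokens in any order, which is the property the announcement highlights.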

Microsoft's New AI PCs Rival Apple's MacBooks

Microsoft revealed Copilot+ PCs, a new category of Windows PCs designed for AI. These PCs boast powerful processors, all-day battery life, and AI features like Recall for instant memory, Cocreator for image creation, Live Captions for real-time translations, and Auto Super Resolution for games.

The Recall feature, which lets users search for anything they've seen or interacted with on their computer screens using natural language, is especially impressive. The new PCs feature an all-new system architecture with a CPU, a GPU, and a high-performance Neural Processing Unit (NPU) working together. Starting at $999, Copilot+ PCs are equipped with OpenAI's GPT-4o models.

Scarlett Johansson hires lawyers over OpenAI's ChatGPT voice

Scarlett Johansson claims OpenAI asked her to voice ChatGPT, but she declined. Later, OpenAI released a voice named "Sky" that sounded eerily similar to her. Johansson was shocked and angered by the similarity and has hired legal counsel to investigate how the "Sky" voice was created.

OpenAI denies that the "Sky" voice was intended to resemble Johansson, has paused its use in its products, and has apologized for not communicating better. Johansson seeks transparency from OpenAI and believes that individual rights must be protected in the era of deepfakes and AI content.

DINO 1.5 is smarter and faster at object detection

IDEA Research launched the Grounding DINO 1.5 open-world object detection model series, with Grounding DINO 1.5 Pro for high-performance detection and Grounding DINO 1.5 Edge for efficient edge computing. Grounding DINO 1.5 Pro achieves state-of-the-art zero-shot transfer performance on several academic benchmarks, surpassing its predecessor.

The model shows strong detection capabilities across various scenarios, including common objects, long-tailed categories, dense objects, and caption phrase grounding. Grounding DINO 1.5 Pro uses a larger Vision Transformer backbone and is pretrained on the high-quality Grounding-20M dataset.

Microsoft's first SoTA SLM to be shipped with Windows

Microsoft announced a new small language model called Phi Silica. It has 3.3 billion parameters, which makes it the smallest model in Microsoft's Phi family. Phi Silica is designed specifically for the Neural Processing Units (NPUs) in Microsoft's new Copilot+ PCs. Despite its small size, Phi Silica can generate 650 tokens per second using only 1.5 watts of power. Because it runs on the NPU, the PC's main processors stay free for other tasks.
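
For a rough sense of scale, the two quoted figures can be combined into an energy-per-token estimate. This is a back-of-the-envelope sketch that assumes the 650 tokens per second and 1.5 watts describe the same sustained workload; the function names are illustrative only.

```python
TOKENS_PER_SECOND = 650.0  # throughput quoted above
POWER_WATTS = 1.5          # power draw quoted above


def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy per token: watts are joules per second."""
    return power_watts / tokens_per_second


def tokens_per_watt_hour(power_watts: float, tokens_per_second: float) -> float:
    """Tokens produced per watt-hour (3600 joules) of energy."""
    return tokens_per_second * 3600 / power_watts


energy = joules_per_token(POWER_WATTS, TOKENS_PER_SECOND)
per_wh = tokens_per_watt_hour(POWER_WATTS, TOKENS_PER_SECOND)
print(f"{energy * 1000:.2f} mJ per token")     # 2.31 mJ per token
print(f"{per_wh:,.0f} tokens per watt-hour")   # 1,560,000 tokens per watt-hour
```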

Developers can access Phi Silica through the Windows App SDK, alongside other AI-powered features such as OCR, Studio Effects, Live Captions, and the Recall User Activity APIs. Microsoft plans to release additional APIs, including Vector Embedding, RAG API, and Text Summarization.

Google unveils new AI tools for branding and product marketing

Google has introduced several new AI-powered features to help retailers and brands better connect with shoppers. First, Google has created a new visual brand profile that will appear in Google Search results. This profile uses information from Google Merchant Center and Google's Shopping Graph to showcase a brand's identity, products, and offerings.

Additionally, Google is expanding its AI-powered tools to help brands create more engaging content and ads. This includes new features in Google's Product Studio, allowing brands to generate images matching their unique style.

Google is also launching immersive ad formats powered by generative AI, such as the ability to include short product videos, virtual try-on experiences, and 3D product views directly in search ads.

Adobe introduces Firefly AI-powered Generative Remove to Lightroom

Adobe has added a new AI-powered feature called Generative Remove to its Lightroom photo editing software. Generative Remove uses Adobe's Firefly generative AI model to let users seamlessly remove objects from photos, even against complex backgrounds. The feature can also remove stains, wrinkles, reflections, and more from images.

Adobe has been integrating Firefly's capabilities across its Creative Cloud apps to generate images, apply styles, and fill areas; the new Generative Remove tool in Lightroom adds object removal to that list. Adobe is working closely with photographers to continue improving and expanding this capability. The company also announced a new Lens Blur effect that uses AI to add realistic depth-of-field blur to photos.

Anthropic uncovers millions of concepts in Claude Sonnet's AI model

Anthropic has made a breakthrough in understanding the inner workings of its AI model, Claude Sonnet, by identifying how millions of concepts are represented within it.

Using a technique called "dictionary learning," the researchers were able to map out these concepts, providing the first-ever detailed look inside a modern, production-grade large language model.

Key findings:

Features linked to concepts: These concepts are linked to features, which are groups of neurons that activate together in response to specific ideas.
Features can be manipulated: By manipulating these features, the researchers were able to influence Claude's outputs, demonstrating a causal link between features and behavior.
Features reveal potential risks: The research identified features corresponding to biases, potential misuse of the model, and even sycophantic behavior.
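
The first two findings can be illustrated with a toy sketch: an activation vector is expressed as a mix of feature directions, and clamping one feature changes the reconstructed activation. This skips the learning step entirely; the dictionary below is hand-written rather than learned, the feature names are hypothetical labels, and none of it reflects Anthropic's actual code or feature set.

```python
# Toy sketch: a fixed, hand-written dictionary of orthonormal feature
# directions in a 4-dimensional activation space. Names are hypothetical.
DICTIONARY = {
    "bridge": (0.5, 0.5, 0.5, 0.5),
    "sycophancy": (0.5, -0.5, 0.5, -0.5),
    "code_error": (0.5, 0.5, -0.5, -0.5),
}


def encode(activation):
    """Project onto each feature direction; ReLU keeps strengths non-negative."""
    return {
        name: max(0.0, sum(a * d for a, d in zip(activation, direction)))
        for name, direction in DICTIONARY.items()
    }


def decode(features):
    """Rebuild an activation as a weighted sum of feature directions."""
    size = len(next(iter(DICTIONARY.values())))
    out = [0.0] * size
    for name, strength in features.items():
        for i, d in enumerate(DICTIONARY[name]):
            out[i] += strength * d
    return tuple(out)


def steer(activation, name, value):
    """Clamp one feature to a chosen strength and rebuild the activation."""
    features = encode(activation)
    features[name] = value
    return decode(features)


activation = (1.5, 1.5, 0.5, 0.5)
features = encode(activation)
steered = steer(activation, "sycophancy", 4.0)
print(features)  # {'bridge': 2.0, 'sycophancy': 0.0, 'code_error': 1.0}
print(steered)   # (3.5, -0.5, 2.5, -1.5)
```

Each feature here spans all four "neurons" rather than a single one, which mirrors the idea that a concept is carried by a pattern of activity. In a real model, the steered activation would be fed back into the forward pass to observe how the output changes.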

Truecaller's AI assistant gets a voice upgrade, thanks to Microsoft

Truecaller is partnering with Microsoft to let users create an AI version of their own voice for their AI Assistant to use when answering calls.

This feature, currently limited to users with access to Truecaller's AI Assistant, requires them to record a voice clip that Microsoft's Azure AI Speech technology will use to create a personalized AI voice.

Truecaller says this will allow for a more personalized experience and highlights the potential of AI in communication. However, it's important to note that Microsoft limits the use of personal voice for specific purposes and requires users to obtain consent before recording someone's voice.

TikTok makes ad creation easy with gen AI!

TikTok has introduced "TikTok Symphony," a suite of generative AI tools designed to help marketers create and optimize ad campaigns. The suite includes an AI video generator called "Symphony Creative Studio," which can produce TikTok-ready videos with minimal input from advertisers, and an AI assistant named "Symphony Assistant" that helps refine scripts and provides best practice recommendations.

The company has also introduced "TikTok One," a centralized hub for marketers to access creators, agency partners, and creative tools. Additionally, TikTok is leveraging predictive AI to drive more sales for advertisers by determining the best creative assets and target audiences based on budgets and goals.

Cohere releases multilingual AI model, Aya 23

Cohere for AI (C4AI), the non-profit research group, has launched open-weight Aya 23, a new family of multilingual language models. Available in 8B and 35B parameter variants, Aya 23 supports 23 languages, including Arabic, Chinese, English, French, German, Hindi, Japanese, Spanish, and more.

Here’s a quick breakdown:

Aya 23 focuses on depth over breadth: it covers fewer languages than the previous model, Aya 101 (which covered 101 languages), but performs better on the ones it supports.
The 8B-parameter model balances efficiency and accessibility, while the larger 35B-parameter model delivers higher performance at the cost of increased computational demand.
Aya 23 outperforms existing models like Google's Gemma on various tasks across the languages it covers.
Researchers can access and fine-tune Aya 23 for their needs, and the model is available to try for free on Cohere Playground.

Arc introduces "Call Arc" for quick voice answers

Arc Search, an AI-powered search app, just launched a new feature called Call Arc. This lets users ask questions by holding their phone to their ear, mimicking a phone call. It provides instant voice answers, similar to voice search, but designed to be more convenient and quicker.

The feature is designed to answer short, immediate questions. For example, you can ask how long it takes to cook spaghetti or why you should reserve pasta water, all while making dinner.

Call Arc complements Arc Search's existing "Browse for me" function that generates webpages with information based on your search query.

Elon Musk envisions AI era, new work norms, life on Mars

In a Q&A session at VivaTech 2024, Elon Musk discussed diverse topics, from plans for Mars colonization to the role of AI in society.

Musk emphasized SpaceX's goal of making life multi-planetary, with Mars as a key focus. He discussed the importance of reusable spacecraft and highlighted the necessity of space exploration for humanity's long-term survival.

Regarding AI, Musk stressed the importance of honesty in AI development, criticizing approaches that prioritize political correctness over truthfulness. He also touched on AI's potential to revolutionize education, though he expressed concerns about the impact of social media on children.

Musk envisioned a future where automation leads to a job-free society, with a universal basic income ensuring people's needs are met.

That's all for now!

Subscribe to The AI Edge and gain exclusive access to content enjoyed by professionals from Moody’s, Vonage, Voya, WEHI, Cox, INSEAD, and other esteemed organizations.

Thanks for reading, and see you on Monday. 😊