AI Weekly Rundown (December 09 to December 15)

Major AI announcements from Google, Microsoft, Runway, Stability AI, Mistral, and more.

Dec 16, 2023

Hello Engineering Leaders and AI Enthusiasts!

Another eventful week in the AI realm. Lots of big news from huge enterprises.

In today’s edition:

🚀 Google releases NotebookLM with Gemini Pro
✨ Mistral AI’s torrent-based release of new Mixtral 8x7B
👾 Berkeley Research’s real-world humanoid locomotion
🎥 Google introduces W.A.L.T, AI for photorealistic video generation
🌍 Runway introduces general world models
🤖 Alter3, a humanoid robot generating spontaneous motion using GPT-4
🎉 Microsoft released Phi-2, an SLM that beats Llama 2
🔢 Anthropic has integrated Claude with Google Sheets
📰 Channel 1 launches AI news anchors with superhuman abilities
🌟 Google’s new AI releases: Gemini API, MedLM, Imagen 2, MusicFX
🖼️ Stability AI’s new Stable Zero123 for quality image-to-3D generation
🧠 OpenAI is researching how humans will steer AI smarter than them
🎞️ Alibaba releases I2VGen-XL, a new AI model for generating HD videos
💻 Intel launches new Core Ultra CPUs with dedicated silicon for AI

Let’s go!

We need your help!

We are working on a Gen AI survey and would love your input.
It takes just 2 minutes.
The survey insights will help us both.
And hey, you might also win a $100 Amazon gift card!

Every response counts. Thanks in advance!

Google releases NotebookLM with Gemini Pro

Google on Friday announced that NotebookLM, its experimental AI-powered note-taking app, is now available to users in the US. The app is also getting many new features with Gemini Pro integration. Here are a few highlights:

Save interesting exchanges as notes

Helpful suggested actions

Various formats for different writing projects

Read everything about what's new.

Source

Mistral AI’s torrent-based release of Mixtral 8x7B

Mistral AI has released its latest LLM, Mixtral 8x7B, via a torrent link. It is a high-quality sparse mixture of experts (SMoE) model with open weights. It outperforms Llama 2 70B on most benchmarks with 6x faster inference, and it matches or outperforms GPT-3.5. It is pre-trained on data from the open web.

Figure: Comparison of Mixtral with the Llama 2 family and the GPT-3.5 base model. Mixtral matches or outperforms Llama 2 70B, as well as GPT-3.5, on most benchmarks.
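For readers who want a feel for what "sparse mixture of experts" means in practice, here is a minimal sketch of top-k expert routing in plain Python. It is not Mistral's implementation: the toy experts, the router scores, and the function names (softmax, route_top_k, moe_forward) are all assumptions made for illustration. The only detail borrowed from Mixtral's public description is that each token is routed to 2 of 8 experts.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    top = ranked[:k]
    weights = softmax([router_scores[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, router_scores, k=2):
    """Weighted sum of the outputs of only the selected experts."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))

# Eight toy "experts": expert i simply multiplies its input by (i + 1).
experts = [lambda x, scale=i + 1: scale * x for i in range(8)]

# Hypothetical router scores for one token (a learned layer would produce these).
scores = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 1.0]

chosen = route_top_k(scores)            # which experts run, and with what weight
output = moe_forward(3.0, experts, scores)
```

Only the two selected experts are evaluated for this token, which is the intuition behind a sparse model being cheaper at inference time than a dense model of comparable total size.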

Source

Berkeley Research’s real-world humanoid locomotion

Berkeley Research has released a new paper that discusses a learning-based approach for humanoid locomotion, which has the potential to address labor shortages, assist the elderly, and explore new planets. The controller used is a Transformer model that predicts future actions based on past observations and actions.

The model is trained using large-scale reinforcement learning in simulation, allowing for parallel training across multiple GPUs and thousands of environments.
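To make the "past observations and actions in, next action out" idea concrete, here is a small sketch of a controller that keeps a sliding window of history. It is only an illustration: the class name HistoryController, the context length, and toy_policy (a hand-written stand-in for the Transformer) are assumptions, not details from the paper.

```python
from collections import deque

class HistoryController:
    """Keeps the last `context_len` (observation, action) pairs
    and asks a policy for the next action."""

    def __init__(self, policy, context_len=4):
        self.policy = policy
        self.history = deque(maxlen=context_len)  # oldest pairs drop off automatically

    def step(self, observation):
        # The policy sees the stored history plus the newest observation.
        action = self.policy(list(self.history), observation)
        self.history.append((observation, action))
        return action

def toy_policy(history, observation):
    """Stand-in for the Transformer: push against the observation,
    nudged by the mean of past actions."""
    if not history:
        return -observation
    mean_past_action = sum(a for _, a in history) / len(history)
    return -observation + 0.5 * mean_past_action

controller = HistoryController(toy_policy, context_len=2)
actions = [controller.step(obs) for obs in [1.0, 0.5, 0.0]]
```

In the real system, the policy would be a trained Transformer and the observations would be sensor readings from the robot; the windowed loop is the part this sketch tries to show.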

Source

Google introduces W.A.L.T, AI for photorealistic video generation

Researchers from Google, Stanford, and Georgia Institute of Technology have introduced W.A.L.T, a diffusion model for photorealistic video generation. The model is a transformer trained on image and video generation in a shared latent space. It can generate photorealistic, temporally consistent motion from natural language prompts and also animate any image.

Source

Runway introduces general world models

Runway is starting a new long-term research effort around what it calls general world models. The belief behind this is that the next major advancement in AI will come from systems that understand the visual world and its dynamics.

A world model is an AI system that builds an internal representation of an environment and uses it to simulate future events within that environment. You can think of Gen-2 as a very early and limited form of a general world model. However, it is still very limited in its capabilities, struggling with complex camera or object motions, among other things.
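The definition above (build an internal representation, then use it to simulate future events) can be illustrated with a deliberately tiny example. This is a toy lookup-table model, nothing like the learned video models Runway describes; the class name TabularWorldModel and the one-dimensional "environment" are hypothetical.

```python
class TabularWorldModel:
    """Toy world model: remembers observed transitions and replays them
    to imagine what would happen under a sequence of actions."""

    def __init__(self):
        self.transitions = {}  # (state, action) -> next_state

    def observe(self, state, action, next_state):
        # Build the internal representation from experience.
        self.transitions[(state, action)] = next_state

    def simulate(self, state, actions):
        # Roll the model forward without touching the real environment.
        trajectory = [state]
        for action in actions:
            key = (state, action)
            if key not in self.transitions:
                break  # the model has never seen this situation
            state = self.transitions[key]
            trajectory.append(state)
        return trajectory

model = TabularWorldModel()
for s, a, s2 in [(0, "right", 1), (1, "right", 2), (2, "left", 1)]:
    model.observe(s, a, s2)

imagined = model.simulate(0, ["right", "right", "left", "left"])
```

The rollout stops at the fourth action because the model never observed moving left from state 1, a small-scale analogue of a model struggling with dynamics it has not learned.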

Source

Alter3, a humanoid robot generating spontaneous motion using GPT-4

Researchers in Tokyo integrated GPT-4 into their proprietary android, Alter3, effectively grounding the LLM in Alter3's bodily movement.

Remarkably, this approach enables Alter3 to adopt various poses, such as a 'selfie' stance or 'pretending to be a ghost,' and generate sequences of actions over time without explicit programming for each body part. This demonstrates the robot's zero-shot learning capabilities. Additionally, verbal feedback can adjust poses, obviating the need for fine-tuning.

Source

Microsoft released Phi-2, an SLM that beats Llama 2

Microsoft released Phi-2, a small language model with 2.7 billion parameters that outperforms Google's Gemini Nano 2 and Llama 2. Phi-2 is small enough to run on a laptop or mobile device and delivers less toxicity and bias in its responses compared to other models.

It was also able to correctly answer complex physics problems and correct students' mistakes, similar to Google's Gemini Ultra model. Phi-2 is currently only licensed for research purposes and cannot be used for commercial purposes.
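As a rough sanity check on the "small enough to run on a laptop" claim, here is a back-of-the-envelope estimate of how much memory the weights alone would need. The 2.7 billion parameter count comes from the announcement; the precisions listed are assumptions for illustration, and the estimate ignores activations and runtime overhead.

```python
PARAMS = 2.7e9  # parameter count reported for Phi-2

# Assumed storage precisions (bytes per parameter); not from the announcement.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int4": 0.5}

def weight_memory_gb(params, bytes_per_param):
    """Memory for the weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9

estimates = {name: weight_memory_gb(PARAMS, b)
             for name, b in BYTES_PER_PARAM.items()}

for name, gb in estimates.items():
    print(f"{name}: about {gb:.2f} GB")
```

At 16-bit precision the weights come to roughly 5.4 GB, which is within reach of many recent laptops.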

Source

Anthropic has integrated Claude with Google Sheets

Anthropic has launched a new prompt engineering tool that makes Claude accessible via spreadsheets. This allows API users to test and refine prompts within their regular workflows and spreadsheets, facilitating easy collaboration with colleagues.

(This allows you to execute interactions with Claude directly in cells.)

Everything you need to know and how to get started with it.

Source

Channel 1 launches AI news anchors with superhuman abilities

Channel 1 will use AI-generated news anchors that have superhuman abilities. These photorealistic anchors can speak any language and even attempt humor.

They will curate personalized news stories based on individual interests, using AI to translate and analyze data. The AI can also create footage of events that were not captured by cameras, while human reporters will be there for on-the-ground coverage.

Source

Enjoying the weekly updates?

Refer your pals to subscribe to our newsletter and get exclusive access to 400+ game-changing AI tools.

Refer a friend

When you use the referral link above or the “Share” button on any post, you'll get the credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.

Google’s new AI releases: Gemini API, MedLM, Imagen 2, MusicFX

Google is introducing a range of generative AI tools and platforms for developers and Google Cloud customers.

Gemini API in AI Studio and Vertex AI: Google is making Gemini Pro available for developers and enterprises to build for their own use cases.
Imagen 2: Text and logo generation, improved image quality, and a host of other features.
MedLM: A family of foundation models fine-tuned for the healthcare industry.
MusicFX: A new experimental tool that enables users to generate their own music using AI.

Google also announced the general availability of Duet AI for Developers and Duet AI in Security Operations.

Source

Stability AI introduces Stable Zero123 for quality image-to-3D generation

Stable Zero123 generates novel views of an object, demonstrating 3D understanding of the object’s appearance from various angles, all from a single image input. Its notably improved quality over Zero1-to-3 or Zero123-XL is due to improved training datasets and elevation conditioning.

The model is now released on Hugging Face to enable researchers and non-commercial users to download and experiment with it.

Source

OpenAI granting $10M to solve the alignment problem

OpenAI, in partnership with Eric Schmidt, is launching a $10 million grants program called "Superalignment Fast Grants" to support research on ensuring the alignment and safety of superhuman AI systems. They believe that superintelligence could emerge within the next decade, posing both great benefits and risks.

Source

Alibaba released ‘I2VGen-XL’ image-to-video AI

Alibaba released I2VGen-XL, a new image-to-video model capable of generating high-definition outputs. It uses cascaded diffusion models and static images as guidance to ensure alignment and enhance model performance. The model is optimized using a large dataset of text-video and text-image pairs. The source code and models will be publicly available.

Source

Intel’s new Core Ultra CPUs bring AI capabilities to PCs

Intel has launched its Intel Core Ultra mobile processors, which bring AI capabilities to PCs. These processors offer improved power efficiency, compute and graphics performance, and an enhanced AI PC experience. They will be used in over 230 AI PCs from partners such as Acer, ASUS, Dell, HP, Lenovo, and Microsoft Surface.

Source

That's all for now!

Subscribe to The AI Edge and gain exclusive access to content enjoyed by professionals from Moody’s, Vonage, Voya, WEHI, Cox, INSEAD, and other esteemed organizations.

Thanks for reading, and see you on Monday. 😊

The AI Edge
