GPT-4V has a new competitor, LLaVA-1.5

Plus: Perplexity.ai and GPT-4 outperform Google Search, Microsoft to debut AI chip.

Oct 09, 2023

Hello Engineering Leaders and AI Enthusiasts!

Welcome to the 121st edition of The AI Edge newsletter. This edition brings you GPT-4V’s newest competitor, LLaVA-1.5.

And a huge shoutout to our incredible readers. We appreciate you😊

In today’s edition:

🤖 OpenAI’s GPT-4 Vision might have a new competitor, LLaVA-1.5
🔥 Perplexity.ai and GPT-4 can outperform Google Search
🚀 Microsoft to debut AI chip and cut Nvidia GPU costs
📚 Knowledge Nugget: A Historical Moment: Meta AI Just Crossed the Uncanny Valley by Jurgen Gravestein

Let’s go!

OpenAI’s GPT-4 Vision might have a new competitor, LLaVA-1.5

New research from Microsoft Research and the University of Wisconsin shows that the fully-connected vision-language cross-modal connector in LLaVA is surprisingly powerful and data-efficient.

The final model, LLaVA-1.5, makes only simple modifications to the original LLaVA yet achieves state-of-the-art results on 11 benchmarks. It uses just 1.2M publicly available data samples, trains in about a day on a single 8-A100 node, and surpasses methods that use billion-scale data. Its responses might even be on par with GPT-4V’s.
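To make the "connector" idea concrete, here is a minimal sketch of a fully-connected projection that maps image patch features into a language model's embedding space. It is an illustration, not the authors' code: the two-layer shape with a GELU in between is our assumption about the design, the dimensions are toy values, the weights are random, and every name (`project_patches`, `init_layer`, `VISION_DIM`, `LLM_DIM`) is hypothetical.

```python
import math
import random


def gelu(x: float) -> float:
    """GELU activation, computed with the Gaussian error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def linear(vec, weight, bias):
    """Compute weight @ vec + bias for a single feature vector."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def init_layer(rng, out_dim, in_dim):
    """Random small weights and zero biases (stand-ins for trained parameters)."""
    weight = [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def project_patches(patches, layer1, layer2):
    """Map each vision patch feature to one 'visual token' in the LLM's space."""
    tokens = []
    for patch in patches:
        hidden = [gelu(h) for h in linear(patch, *layer1)]
        tokens.append(linear(hidden, *layer2))
    return tokens


# Toy sizes chosen for readability; real models use far larger dimensions.
VISION_DIM, LLM_DIM, NUM_PATCHES = 8, 16, 4

rng = random.Random(0)
layer1 = init_layer(rng, LLM_DIM, VISION_DIM)
layer2 = init_layer(rng, LLM_DIM, LLM_DIM)

# Pretend these came from a frozen image encoder: one vector per image patch.
patches = [[rng.gauss(0.0, 1.0) for _ in range(VISION_DIM)] for _ in range(NUM_PATCHES)]

visual_tokens = project_patches(patches, layer1, layer2)
print(len(visual_tokens), len(visual_tokens[0]))  # prints: 4 16
```

The point of the sketch is how little machinery sits between the two models: each patch vector becomes one token the language model can read alongside the text tokens.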

Why does this matter?

Large multimodal models (LMMs) are becoming increasingly popular and may be the key building blocks for general-purpose assistants. The LLaVA architecture is leveraged in different downstream tasks and domains, including biomedical assistants, image generation, and more. The above research establishes stronger, more feasible, and affordable baselines for future models.

Source

Perplexity.ai and GPT-4 can outperform Google Search

New research by Google, OpenAI, and the University of Massachusetts presents FreshQA and FreshPrompt. FreshQA is a novel dynamic QA benchmark that includes questions that require fast-changing world knowledge as well as questions with false premises that need to be debunked.

FreshPrompt is a simple few-shot prompting method that substantially boosts the performance of an LLM on FreshQA by incorporating relevant and up-to-date information retrieved from a search engine into the prompt. The experiments show that FreshPrompt outperforms both competing search engine-augmented prompting methods such as Self-Ask and commercial systems such as Perplexity.ai.
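As a rough illustration of search-augmented few-shot prompting, the sketch below assembles a prompt from demonstrations, dated evidence snippets, and the question. It is not the paper's implementation: the `Evidence` class, the `build_prompt` function, the field labels, the evidence cap, and the choice to list the newest snippet closest to the question are all our assumptions, and the snippets are placeholders rather than real search results.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Evidence:
    """One retrieved search snippet (hypothetical structure)."""
    source: str
    published: date
    snippet: str


def format_evidence(evidence: Evidence) -> str:
    """Render a snippet as a small labeled text block."""
    return (
        f"source: {evidence.source}\n"
        f"date: {evidence.published.isoformat()}\n"
        f"snippet: {evidence.snippet}"
    )


def build_prompt(question, evidences, demonstrations, today, max_evidences=3):
    """Assemble demonstrations, then evidence (oldest to newest), then the question."""
    # Keep only the most recent snippets; the newest ends up right before the question.
    recent = sorted(evidences, key=lambda e: e.published)[-max_evidences:]
    blocks = list(demonstrations)
    blocks.extend(format_evidence(e) for e in recent)
    blocks.append(
        f"question: {question}\n"
        f"Today is {today.isoformat()}. Reason over the evidence above, then answer."
    )
    return "\n\n".join(blocks)


# Placeholder inputs; a real pipeline would fill these from a search engine.
evidences = [
    Evidence("example.com/old", date(2021, 1, 5), "Older placeholder snippet."),
    Evidence("example.com/new", date(2023, 10, 1), "Newest placeholder snippet."),
    Evidence("example.com/mid", date(2022, 6, 1), "Middle placeholder snippet."),
]
prompt = build_prompt(
    "Hypothetical question?",
    evidences,
    ["(few-shot demonstration goes here)"],
    today=date(2023, 10, 9),
    max_evidences=2,
)
print(prompt)
```

With `max_evidences=2`, the 2021 snippet is dropped and the 2023 snippet lands directly above the question, which is the part of the design that keeps stale information from crowding out fresh facts.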

Why does this matter?

While the research gives a “fresh” look at LLMs in the context of factuality, it also introduces a new technique that combines more information from Google Search with smart reasoning, improving GPT-4’s performance on FreshQA from 29% to 76%. Will it make AI models better and slowly replace Google Search?

Source

Microsoft to debut AI chip and cut Nvidia GPU costs

Microsoft plans to unveil its first chip designed for AI at its annual developers’ conference next month. Similar to Nvidia GPUs, the chip will be designed for data center servers that train and run LLMs, and is codenamed Athena.

Microsoft’s data center servers currently use Nvidia GPUs to power cutting-edge LLMs for cloud customers, including OpenAI and Intuit, as well as for AI features in Microsoft’s productivity apps.

Why does this matter?

The move will allow Microsoft to reduce its reliance on Nvidia-designed AI chips, which have been in short supply as demand for them has boomed.

Additionally, it could lead to a return on Microsoft’s investment in OpenAI, which has reportedly raised concerns about the high cost of the hardware required to power its AI models and is therefore also exploring making its own chips.

Source

Enjoying the daily updates?

Refer your pals to subscribe to our daily newsletter and get exclusive access to 400+ game-changing AI tools.

Refer a friend

When you use the referral link above or the “Share” button on any post, you'll get the credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.

Knowledge Nugget: A Historical Moment: Meta AI Just Crossed the Uncanny Valley

In this thought-provoking article, Jurgen Gravestein discusses a groundbreaking moment where Mark Zuckerberg, Meta’s CEO, was interviewed by Lex Fridman in a shared virtual space using Meta's VR headsets, featuring photorealistic avatars. This event marks the crossing of the "uncanny valley" for digital avatars in VR, making virtual interactions more lifelike and intimate.

But that’s not all. Meta's vision involves blending the digital and physical worlds, with a focus on AI-powered entertainment, including celebrity-based AI characters. The article emphasizes that what we see is just the beginning, and Meta's direction may redefine the future of virtual reality and entertainment.

Why does this matter?

Meta's achievement is a significant step forward in AI and VR integration. But has it really crossed the AI uncanny valley completely? I think that remains to be seen, depending on how quickly this technology evolves, how widely users adopt it, and how it is integrated into various applications.

Source

What Else Is Happening❗

🎨Adobe to announce a revolutionary AI-powered photo editing tool

It teased a fraction of the capabilities of the new “object-aware editing engine”, dubbed Project Stardust, in a promotional video. More news is expected at the Adobe Max event tomorrow. (Link)

💼China plans big AI and computing buildup to benefit local firms

It aims to grow the country’s computing power by more than a third in less than three years, a move set to benefit local suppliers and boost technology self-reliance as US sanctions pressure domestic industry. (Link)

✅BBC blocked OpenAI data scraping but is open to AI-powered journalism

It has blocked web crawlers from OpenAI and Common Crawl from accessing BBC websites. But it plans to work with tech companies, other media organizations, and regulators to safely develop generative AI and focus on maintaining trust in the news industry. (Link)

🔍The U.N. and the Netherlands launched a project to help Europe prepare for AI supervision

In the project, UNESCO will assemble information about how European countries are currently supervising AI and put together a list of “best practices” recommendations. The Dutch digital infrastructure agency (RDI) will aid UNESCO. (Link)

💰Snoop Dogg joins the AI arms race, invests in AI language startup THINKIN

Built upon OpenAI's GPT technology, THINKIN's AI is carefully customized and fine-tuned for the explicit purpose of teaching foreign languages. (Link)

That's all for now!

Subscribe to The AI Edge and join the impressive list of readers that includes professionals from Moody’s, Vonage, Voya, WEHI, Cox, INSEAD, and other reputable organizations.

Thanks for reading, and see you tomorrow. 😊

The AI Edge
