Thought-to-Video AI Pushes Innovation Limits

Plus: Bing's New Features, DragGAN: AI model for point-based image manipulation

May 22, 2023

Hello, Engineering Leaders and AI Enthusiasts,

Welcome to the 24th edition of The AI Edge newsletter. This edition brings you Mind-Video, a groundbreaking thought-to-video AI model. Thank you to everyone reading this. 😊

In today’s edition:

🧠 Mind-Video: High-quality video reconstruction from brain activity
🚀 Microsoft’s AI-powered Bing gets new features
🤖 DragGAN: New AI model for point-based image manipulation
📚 How to reduce the cost of using LLM APIs by 98%

Let’s go!

Mind-Video: High-quality video reconstruction from brain activity

Mind-Video is a method for reconstructing continuous visual experiences as videos from non-invasive brain recordings, specifically continuous fMRI data of the cerebral cortex. The approach combines masked brain modeling, multimodal contrastive learning with spatiotemporal attention, and co-training with an augmented Stable Diffusion model.

Adversarial guidance is used to reconstruct high-quality videos at arbitrary frame rates. The reconstructed videos were evaluated with semantic and pixel-level metrics, achieving an average accuracy of 85% in semantic classification tasks and a structural similarity index (SSIM) of 0.19, surpassing the previous state of the art by 45%. The authors also report that the model is biologically plausible and interpretable, with behavior that reflects established physiological processes.
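For readers unfamiliar with SSIM, here is a minimal sketch of how the score is computed. It is a simplification: it treats the whole frame as a single window, whereas standard implementations average the score over many small local windows, and it does not reproduce the evaluation protocol used for Mind-Video. The function name `global_ssim` and the sample pixel values are our own, chosen only for illustration.

```python
from statistics import fmean


def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM for two equally sized flat lists of pixel values.

    Simplified for illustration: real SSIM implementations compute this
    statistic over local windows and average the results.
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and the same length")

    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean([(a - mu_x) ** 2 for a in x])
    var_y = fmean([(b - mu_y) ** 2 for b in y])
    cov = fmean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 4-pixel "frames", used only to exercise the function.
frame = [0, 64, 128, 255]
inverted = [255 - v for v in frame]

same_score = global_ssim(frame, frame)      # identical frames score about 1
diff_score = global_ssim(frame, inverted)   # an inverted frame scores far lower
```

A score of 1 means the two frames are structurally identical, so a value like 0.19 indicates that pixel-level reconstruction from brain activity is still far from exact, even though it improves on earlier work.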

Why does this matter?

Technologies like Mind-Video could deepen our understanding of how the human brain processes visual experience. That understanding, in turn, may open up new possibilities for medical interventions and drive further innovation in AI.

Source

Microsoft’s AI-powered Bing gets new features like chat history, charts, exports & more

Microsoft has been adding features and improving responses since it unveiled its new AI-powered Bing. Several features shipped in the latest update and are now fully available to users. These updates include:

Chat history: Save and access previous conversations easily.
Charts and visualizations: Generate visual representations of data.
Export: Export chat answers to PDF, text files, or Word documents.
Video overlay: Watch full-screen videos in response to specific queries.
Optimized recipe answers: Improved design for recipe-related information.
Share fixes: Resolved issues with the Share dialog.
Auto-suggest quality: Enhanced word suggestions for faster interactions.
Privacy improvements in Edge sidebar: Better privacy for conversations involving private or local content.

Why does this matter?

The updates might help Microsoft attract more users to Bing. Google made a lot of noise and attracted a lot of eyeballs at its I/O event, and the Bing updates could be seen as a response to Google’s announcements. However, only time will tell which tech behemoth will own the space.

Source

DragGAN: A new AI model for interactive point-based image manipulation

Researchers from Google, MIT, and the Max Planck Institute for Informatics have proposed DragGAN. The novel approach lets a user interactively "drag" any points in an image so that they precisely reach target points.

Through DragGAN, anyone can deform an image with precise control over where pixels go, thus manipulating the pose, shape, expression, and layout of diverse categories such as animals, cars, humans, landscapes, etc.

Why does this matter?

Move over, Photoshop! DragGAN could cut image-editing workflows from hours to minutes, and it represents a significant leap forward for AI models and their capabilities. The development shows that AI models can be more flexible, versatile, and capable of meeting users' specific needs.

Source

Knowledge Nugget: How to reduce the cost of using LLM APIs by 98%

Cost is still a major factor when scaling services on top of LLM APIs. It can get very expensive, especially when using LLMs on large collections of queries and text.

This article covers three strategies, outlined in a recent study from Stanford University, that can reduce the cost of using LLMs: prompt adaptation, LLM approximation, and LLM cascade. The study also proposes FrugalGPT, an LLM cascade that can match the performance of the best individual LLM (e.g., GPT-4) with up to 98% cost reduction, or improve accuracy over GPT-4 by 4% at the same cost.
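To make the cascade idea concrete, here is a minimal sketch, assuming a setup in which queries go to the cheapest model first and escalate only when a scoring function does not trust the answer. This is not the FrugalGPT implementation: the study learns its scoring function and model order from data, while the `Model` class, the prices, the toy models, and `toy_scorer` below are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Model:
    name: str
    cost_per_call: float          # hypothetical flat price per query
    answer: Callable[[str], str]  # stand-in for a real LLM API call


def cascade(
    query: str,
    models: List[Model],
    scorer: Callable[[str, str], float],
    threshold: float = 0.8,
) -> Tuple[str, str, float]:
    """Query models from cheapest to priciest; stop at the first trusted answer.

    Returns (response, name of the model that produced it, total cost spent).
    """
    if not models:
        raise ValueError("at least one model is required")

    total_cost = 0.0
    response, used = "", ""
    for model in sorted(models, key=lambda m: m.cost_per_call):
        response = model.answer(query)
        total_cost += model.cost_per_call
        used = model.name
        if scorer(query, response) >= threshold:
            break  # answer is trusted, so skip the more expensive models
    return response, used, total_cost


# Toy models: the small one only knows one answer, the large one knows two.
small = Model("small", 0.001,
              lambda q: "4" if q == "What is 2+2?" else "I am not sure")
large = Model("large", 0.03,
              lambda q: {"What is 2+2?": "4",
                         "Capital of France?": "Paris"}.get(q, "unknown"))


def toy_scorer(query: str, response: str) -> float:
    """Hypothetical reliability score: distrust answers that admit doubt."""
    return 0.0 if "not sure" in response else 1.0


easy = cascade("What is 2+2?", [large, small], toy_scorer)
hard = cascade("Capital of France?", [large, small], toy_scorer)
```

In this toy run, the easy query is answered by the small model alone, while the hard query pays for both calls. The savings come from the share of traffic that never reaches the expensive model, so the quality of the scoring function largely determines both cost and accuracy.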

Why does this matter?

The findings and strategies presented in the study enable cost-efficient use of LLMs, making them more accessible and sustainable. They also attack the problem of high inference costs from a different angle: businesses can become more cost-effective without waiting for the underlying models to get cheaper. As a result, this could widen the range of problems where using LLMs makes sense.

Source

What Else Is Happening

🔮 Explore mesmerizing 360° worlds brought to life from a sketch and Stable Diffusion (Link)

👃 MS Artificial Nose: A smart device that identifies smells (Link)

🚀 Copilot has arrived: The next generation of Perplexity AI's search companion (Link)

🎶 SoundStorm unleashes instant audio magic (Link)

🤖 AI robots in the U.S. change tires in half the time humans take (Link)

🔬 AI uncovers rare DNA sequence in gene activation research (Link)

Trending Tools

Copernic AI: Generate 360° panoramas and little planets in seconds with Copernic AI for immersive VR experiences.
Slayer: Create audio stories, podcasts, and meditations instantly. Juicebox feature coming soon for curated daily news.
Codemorph: Effortlessly convert code between programming languages to enhance collaboration and unlock new possibilities.
AI Reads: Stay informed with AIReads as it summarizes and reads news articles to you anytime, anywhere.
Vest: Make informed stock decisions with VEST, the AI-powered platform offering real-time analysis and insights.
Copying AI: Convert YouTube videos into blog posts effortlessly with Copyingai. No transcript is required for text extraction.
Jottery: Empower engineers with efficient communication, market research, and project planning using Jottery's AI tool.
Zeda 2.0: AI-powered platform for product discovery & strategy. Uncover problems, build purposefully, drive outcomes.

That's all for now!

If you are new to ‘The AI Edge’ newsletter, subscribe to receive the ‘Ultimate AI tools and ChatGPT Prompt guide’, designed specifically for Engineering Leaders and AI enthusiasts.

Thanks for reading, and see you tomorrow.

The AI Edge
