ChatGPT Builds Robots
Plus: Magic123, a novel image-to-3D pipeline & MS CoDi for any-to-any generation.
Hello Engineering Leaders and AI Enthusiasts!
Welcome to the 54th edition of The AI Edge newsletter. This edition brings you an experiment where Microsoft used ChatGPT for robotics applications.
And a huge shoutout to our amazing readers. We appreciate you!😊
In today’s edition:
🤖 ChatGPT builds robots: New research
⚡️ Magic123 creates HQ 3D meshes from unposed images
🎯 Any-to-any generation: Next stage in AI evolution
📚 Knowledge Nugget: Building Boba AI
Let’s go!
ChatGPT builds robots: New research
Microsoft Research presents an experimental study using OpenAI’s ChatGPT for robotics applications. It outlines a strategy that combines design principles for prompt engineering and the creation of a high-level function library that allows ChatGPT to adapt to different robotics tasks, simulators, and form factors.
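To make the "high-level function library" idea concrete, here is a minimal sketch. It is not code from the Microsoft study: the function names (`takeoff`, `fly_to`, `land`), the prompt wording, and the plan format are all hypothetical, and the call to the model itself is left out. It shows only the two pieces the strategy describes: telling the model which functions it may use, and executing only those functions.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical high-level functions. A real library would wrap a simulator
# or robot SDK; these stubs just report what they would do.
def takeoff() -> str:
    return "took off"


def fly_to(x: float, y: float, z: float) -> str:
    return f"flew to ({x}, {y}, {z})"


def land() -> str:
    return "landed"


# name -> (callable, description shown to the model)
LIBRARY: Dict[str, Tuple[Callable[..., str], str]] = {
    "takeoff": (takeoff, "takeoff(): lift off and hover"),
    "fly_to": (fly_to, "fly_to(x, y, z): fly to a position in meters"),
    "land": (land, "land(): descend and stop the motors"),
}


def build_prompt(task: str) -> str:
    """Describe the allowed functions and the task in plain text."""
    lines = ["You control a drone. Use only these functions:"]
    for _, description in LIBRARY.values():
        lines.append(f"- {description}")
    lines.append(f"Task: {task}")
    return "\n".join(lines)


def run_plan(plan: List[Tuple[str, tuple]]) -> List[str]:
    """Execute (function name, arguments) steps, rejecting unknown names."""
    log = []
    for name, args in plan:
        if name not in LIBRARY:
            raise ValueError(f"unknown function: {name}")
        func, _ = LIBRARY[name]
        log.append(func(*args))
    return log


if __name__ == "__main__":
    print(build_prompt("Inspect the top shelf"))
    # A hand-written plan standing in for what a model might return.
    plan = [("takeoff", ()), ("fly_to", (1.0, 2.0, 3.0)), ("land", ())]
    print(run_plan(plan))
```

Keeping the library small and descriptive means the same prompt pattern can be pointed at a different robot or simulator by swapping the functions behind the names.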
The study encompasses a range of tasks within the robotics domain, from basic logical, geometrical, and mathematical reasoning to complex domains such as aerial navigation, manipulation, and embodied agents.
Microsoft also released PromptCraft, an open-source platform where anyone can share examples of good prompting schemes for robotics applications.
Why does this matter?
The experiment investigates how well the abilities of LLMs like ChatGPT generalize to the robotics domain, and whether they can solve robotics tasks through natural language instructions. The approach could benefit industries that already rely on robotics, such as healthcare and automotive.
Magic123 creates HQ 3D meshes from unposed images
New research from Snap Inc. (and others) presents Magic123, a novel image-to-3D pipeline that uses a two-stage coarse-to-fine optimization process to produce high-quality, high-resolution 3D geometry and textures. It generates photo-realistic 3D objects from a single unposed image.
The core idea is to use 2D and 3D priors simultaneously to generate faithful 3D content from any given image. Magic123 achieves state-of-the-art results in both real-world and synthetic scenarios.
Why does this matter?
While realistic 2D image data is abundant online, 3D datasets are scarce, which hinders large-scale learning of 3D geometry for AI models. Plus, working with 3D data requires significant computational resources, or you’ll end up with low-resolution results.
This approach addresses these issues and significantly improves over previous image-to-3D techniques in terms of both efficiency and level of detail.
Any-to-any generation: Next stage in AI evolution
Microsoft presents CoDi, a novel generative model capable of processing and simultaneously generating content across multiple modalities. It employs a novel composable generation strategy that involves building a shared multimodal space by bridging alignment in the diffusion process. This enables the synchronized generation of intertwined modalities, such as temporally aligned video and audio.
One of CoDi’s most significant innovations is its ability to handle many-to-many generation strategies, simultaneously generating any mixture of output modalities. CoDi is also capable of single-to-single modality generation and multi-conditioning generation.
Why does this matter?
Composable Diffusion marks a significant step toward more engaging and holistic human-computer interactions. It establishes a solid foundation for future investigations in generative AI. Moreover, CoDi unlocks numerous possibilities for real-world applications requiring multimodal integration.
Knowledge Nugget: Building Boba AI
Boba is an experimental AI co-pilot for product strategy & generative ideation, designed to augment the creative ideation process. It mediates an interaction between a human user and an LLM, currently GPT-3.5.
In this article, the author shares the lessons learned from building LLM-powered generative co-pilot applications, formulated as a set of patterns. Building the application was also aimed at answering:
How to design and build generative experiences beyond chat, powered by LLMs
How to use AI to augment product and strategy processes and craft
Why does this matter?
The patterns outlined in the article can help you enhance the performance and results of your own LLM-powered applications. The article also provides best practices and insights into working around the limitations of your AI application and enhancing its capabilities.
What Else Is Happening❗
🎥Lights, Text, Action! Meta's MusicGen + Zeroscope can craft brilliant videos (Link)
📲Humane’s first product is an AI-powered wearable device with a projected display (Link)
🪄Microsoft is giving early users a sneak peek at its AI assistant for Windows 11 (Link)
🌀Midjourney released a “weird” parameter that can give images a crazy twist! (Link)
💡Nvidia acquired OmniML, an AI startup that shrinks machine-learning models (Link)
🧪The first drug fully generated by AI entered clinical trials with human patients (Link)
🚀Moonlander launches AI-based platform for immersive 3D game development (Link)
🛠️ Trending Tools
Game of Prompts: Next-level QR code generator with visually stunning codes embedded in images. Free to use.
BuyLens AI: AI-powered Chrome extension for capturing and tracking products/travel/real estate. Manage expenses and budget easily.
AI Studio: ChatGPT4 powered video creator for mind-blowing studio-quality videos in seconds, in any language. Earn $400/day profit.
Myreader AI: Chat with books or a library, jump to specific pages, and access up to 20,000 pages in the cloud anytime.
QuickNoter: Online tool powered by GPT & AI models for efficient note-taking from various sources. Create a searchable knowledge base.
Bank Statement to CSV/Excel: Convert unstructured bank statements into organized CSV/Excel files in seconds, eliminating manual data entry and errors.
Odience: AI-powered interest selector for Meta Ads. Build smart audiences based on unique interests.
Fronts AI: Mobile-friendly website manager for launching, skills listing, payments, and meetings.
That's all for now!
If you are new to ‘The AI Edge’ newsletter, subscribe to receive the ‘Ultimate AI tools and ChatGPT Prompt guide’, specifically designed for Engineering Leaders and AI enthusiasts.
Thanks for reading, and see you tomorrow. 😊