iOS 18 May Have OpenAI Gen AI Capabilities
Plus: China's Vidu generates 16-second 1080P videos in one click, New S1 robot mimics human-like movements, speed, and precision
Hello Engineering Leaders and AI Enthusiasts!
Welcome to the 263rd edition of The AI Edge newsletter. This edition brings you details on Apple’s reignited talks with OpenAI to power iOS 18 with generative AI.
And a huge shoutout to our incredible readers. We appreciate you 😊
In today’s edition:
🍎 iOS 18 may have OpenAI-powered gen AI capabilities
🎥 China's Vidu generates 16-second 1080P videos, matching OpenAI's Sora
🤖 New S1 robot mimics human-like movements, speed, and precision
💡 Knowledge Nugget: What can LLMs never do?
Let’s go!
iOS 18 may have OpenAI-powered gen AI capabilities
Apple has reportedly reopened talks with OpenAI to incorporate generative AI capabilities into the upcoming iOS 18 operating system, which will power the next generation of iPhones. The tech giant has been quietly exploring ways to enhance Siri and introduce new AI-powered features across its ecosystem. The two companies are reportedly negotiating the terms of an agreement.
Apple is also in discussions with Google about licensing its Gemini chatbot technology. As of now, Apple hasn't made a final decision on which partners it will work with, and there's no guarantee that a deal will be finalized. The company may ultimately reach agreements with both OpenAI and Google or choose another provider entirely.
Why does this matter?
The renewed talks signal Apple’s urgency to accelerate its gen AI efforts and catch up with its Big Tech rivals. If successful, the collaboration could position Apple as a leader in AI-driven mobile devices and set a new standard for chatbot-like interactions. Users can anticipate more sophisticated AI features, improved voice assistants, and a wider range of AI-powered applications on future iPhones.
China's Vidu generates 16-second 1080P videos, matching OpenAI's Sora
At the ongoing Zhongguancun Forum in Beijing, Chinese tech firm ShengShu-AI and Tsinghua University unveiled Vidu, a text-to-video AI model. Vidu is said to be the first Chinese AI model on par with OpenAI's Sora, capable of generating 16-second 1080P video clips with a single click. The model is built on a self-developed architecture called Universal Vision Transformer (U-ViT), which combines two kinds of AI model: the diffusion model and the Transformer.
During a live demonstration, Vidu showcased its ability to simulate the real physical world, generating scenes with complex details that adhere to physical laws, such as realistic light and shadow effects and intricate facial expressions. Vidu is also said to have a deep understanding of Chinese cultural elements and can generate imagery of distinctively Chinese subjects such as pandas and loong (Chinese dragons).
Why does this matter?
Vidu's launch represents a technical and strategic achievement for China. Few, if any, other text-to-video AI models have been built around cultural nuances with the aim of preserving national identity. Moreover, the integration of diffusion and Transformer models in the U-ViT architecture pushes the boundaries of realistic and dynamic video generation, potentially reshaping what’s possible in creative industries.
New S1 robot mimics human-like movements, speed, and precision
Chinese robotics firm Astribot, a subsidiary of Stardust Intelligence, has previewed its advanced humanoid robot assistant, the S1. In a recently released video, the S1 shows remarkable agility, dexterity, and speed while doing various household tasks, marking a significant milestone in the development of humanoid robots.
Using imitation learning, the S1 robot can execute intricate tasks at a pace matching that of adult humans. The video showcases the robot's impressive capabilities, such as smoothly pulling a tablecloth from beneath a stack of wine glasses, opening a bottle and pouring wine, delicately shaving a cucumber, and flipping a sandwich. Astribot says the S1 is currently undergoing rigorous testing and is slated for commercial release in 2024.
Why does this matter?
The AI-powered humanoid robot industry is booming with innovation and competition. OpenAI-backed robotics companies recently showed off two impressive bots: one folding laundry with "soft-touch" skills and another demonstrating natural language reasoning. Boston Dynamics unveiled the Atlas robot, and UBTech from China introduced its speaking bot, Walker S. Now, Astribot's S1 bot has amazed us with its incredible speed and precision in household tasks.
Enjoying the daily updates?
Refer your pals to subscribe to our daily newsletter and get exclusive access to 400+ game-changing AI tools.
When you use the referral link above or the “Share” button on any post, you'll get the credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.
Knowledge Nugget: What can LLMs never do?
In his detailed piece, Krishnan argues that despite their impressive capabilities, LLMs struggle with tasks like playing Wordle or solving Sudoku, which even young children can do. This is due to limitations in how LLMs process information, leading to "goal drift," where their attention mechanism loses focus on the overall objective as the number of reasoning steps increases. LLMs do a single forward pass of inference at a time and lack the ability to stop, gather world state, reason, revisit previous answers, or predict future ones unless explicitly detailed in the training data.
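For readers who like to see the idea in code, here is a minimal toy sketch of "one forward pass at a time." It is not how any real LLM is implemented: `TOY_MODEL`, `forward_pass`, and `greedy_decode` are hypothetical names made up for illustration, and the "model" is just a lookup table keyed on the last token. The point is only that each step commits one token and the loop never goes back to revise an earlier choice.

```python
from typing import Dict, List

# Hypothetical stand-in for a language model: next-token scores keyed on the
# most recent token only. A real LLM conditions on the whole context.
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.7, "dog": 0.3},
    "a": {"dog": 1.0},
    "cat": {"sat": 0.9, "</s>": 0.1},
    "dog": {"ran": 0.8, "</s>": 0.2},
    "sat": {"</s>": 1.0},
    "ran": {"</s>": 1.0},
}


def forward_pass(context: List[str]) -> Dict[str, float]:
    """One 'inference step': return scores for the next token."""
    return TOY_MODEL[context[-1]]


def greedy_decode(max_steps: int) -> List[str]:
    """Generate tokens one at a time, always taking the top-scoring one."""
    tokens = ["<s>"]
    for _ in range(max_steps):
        scores = forward_pass(tokens)
        next_token = max(scores, key=scores.get)
        # The token is committed here. Nothing in this loop can pause,
        # check an external state, or go back and change an earlier token.
        tokens.append(next_token)
        if next_token == "</s>":
            break
    return tokens[1:]  # drop the start marker


if __name__ == "__main__":
    print(greedy_decode(10))
```

A game like Wordle or Sudoku rewards exactly what this loop leaves out: checking the board, discarding a bad guess, and trying again.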
Krishnan proposes fundamental architectural changes, such as adding modules for reasoning, persistent memory, grounding in the physical world, and hierarchical planning agents to achieve human-like general intelligence.
However, Krishnan acknowledges that even if LLMs fall short of AGI, their current capabilities are remarkable. They showcase meaningful intelligence through advanced statistical pattern matching based on enormous datasets, albeit different from human cognition.
Why does this matter?
While LLMs can provide valuable insights and automate certain tasks, their inability to perform basic reasoning and maintain focus on overall objectives raises concerns about over-reliance on these models. With their application and adoption expanding to various industries, decision-makers need to thoroughly evaluate their limitations and potential risks before implementing them in high-stakes scenarios.
What Else Is Happening❗
💄 Estée Lauder and Microsoft's collaboration for beauty brands
Estée Lauder Companies (ELC) and Microsoft have launched the AI Innovation Lab to help ELC's brands leverage generative AI. The collaboration aims to enable faster responses to social trends and consumer demands, as well as accelerate product innovation.
🚀 Oracle boosts Fusion Cloud apps with 50+ generative AI capabilities
Oracle has launched new generative AI features across its Fusion Cloud CX suite to help sales, marketing, and service agents automate and accelerate critical workflows. The AI capabilities will enable contextually aware responses, optimized schedules for field service agents, targeted content creation, and AI-based look-alike modeling for contacts.
💬 Google's new AI feature helps users practice English conversations
Google's chatbot, currently available in select countries through Search Labs or Google Translate on Android, provides feedback and helps users find the best words and conjugations within the context of a conversation.
🧠 OpenAI enhances ChatGPT with user-specific memory update
The update enables ChatGPT to provide more personalized and contextually relevant responses over time by storing details about users' preferences and interactions. Users have control over the memory feature, including the ability to toggle it on or off, inspect stored information, and delete specific data entries.
🤝 Tech CEOs join DHS advisory board on AI safety and security
The US DHS has announced a blue-ribbon board that includes CEOs of major tech companies to advise the government on the role of AI in critical infrastructure. They will develop recommendations to prevent and prepare for AI-related disruptions to critical services that impact national economic security, public health, or safety.
New to the newsletter?
The AI Edge keeps engineering leaders & AI enthusiasts like you on the cutting edge of AI. From ML to ChatGPT to generative AI and LLMs, we break down the latest AI developments and how you can apply them in your work.
Thanks for reading, and see you tomorrow. 😊