China’s DeepSeek Rattles America's AI Empire
Plus: OpenAI accuses DeepSeek of unauthorized use of its model outputs, a new benchmark exposes AI models' knowledge gaps, President Trump unveils the $500B Stargate project, and more.
Hello Engineering Leaders and AI Enthusiasts!
This newsletter brings you the latest AI updates in a crisp manner! Dive in for a quick recap of everything important that happened around AI in the past two weeks.
And a huge shoutout to our amazing readers. We appreciate you😊
In today’s edition:
⚡ DeepSeek-R1: The AI disruptor making waves
🔍 OpenAI accuses DeepSeek of using its model outputs
🕵️‍♂️ "Humanity’s Last Exam" exposes AI’s knowledge gaps
🤖 OpenAI launches its first AI agent, "Operator"
💰 President Trump unveils the launch of $500B "Stargate"
💡 A new AI lab challenges traditional AGI approaches
🧠 Knowledge Nugget: When AI Promises Speed but Delivers Debugging Hell
Let’s go!
DeepSeek-R1: The AI disruptor making waves
Chinese AI lab DeepSeek released DeepSeek-R1, an open-source reasoning model that matches or outperforms OpenAI’s o1 in key areas like math and coding—at a fraction of the cost.
With training expenses under $6M (compared with the hundreds of millions spent by U.S. rivals), R1’s low-cost, high-performance approach is shaking up the AI landscape. DeepSeek has also open-sourced all its models under an MIT license, making them free to use and download. Its AI assistant climbed to the top of the Apple App Store’s free apps chart.
Why does it matter?
The AI model has fueled new U.S.-China tech debate. Despite U.S. chip sanctions, DeepSeek’s rapid rise raises questions about China’s AI capabilities and potential shifts in global tech power.
Further, it challenges the idea that bigger is always better. Cutting-edge AI may not require massive scale, huge funding rounds, and exclusive chip access—casting doubt on sky-high valuations for companies such as Nvidia (DeepSeek’s release wiped out nearly $600B of Nvidia’s market cap).
OpenAI accuses DeepSeek of using its model outputs
OpenAI is claiming that DeepSeek may have improperly used OpenAI's model outputs to train its own AI, potentially violating OpenAI’s terms of service. The accusation revolves around "distillation," a common technique where smaller AI models are trained using outputs from larger ones.
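To make the "distillation" technique concrete, here is a minimal sketch of the standard recipe: the student model is trained to match the teacher's softened output distribution via a KL-divergence loss. This illustrates the general method only, not DeepSeek's or OpenAI's actual pipelines; the temperature value and toy logits are arbitrary choices for illustration.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature softens them."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Minimizing this pushes the student to reproduce the teacher's full
    output distribution, not just its top answer.
    """
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return float(np.sum(t * (np.log(t) - np.log(s)), axis=-1).mean())

# Toy example: a student whose logits track the teacher's incurs a
# lower loss than one that ranks the classes differently.
teacher    = np.array([[4.0, 1.0, 0.5]])
aligned    = np.array([[3.8, 1.1, 0.4]])
misaligned = np.array([[0.5, 4.0, 1.0]])
assert distillation_loss(aligned, teacher) < distillation_loss(misaligned, teacher)
```

The controversy is not about the math, which is standard, but about where the teacher outputs come from: harvesting them from a commercial API can violate that provider's terms of service.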
OpenAI and Microsoft had already blocked suspected DeepSeek accounts last year for engaging in this practice. Now they are investigating whether data output from OpenAI’s technology was obtained in an unauthorized manner by a DeepSeek-linked group.
The U.S. Navy has banned the use of DeepSeek over potential security risks, while Meta has assembled ‘war rooms’ of engineers to determine how DeepSeek’s AI is beating competing AI at a fraction of the price.
Why does it matter?
DeepSeek’s low-cost, high-performance AI could disrupt the current AI ecosystem. With open-source AI, geopolitical tensions, and AI valuation concerns all colliding, DeepSeek’s rise is reshaping the AI landscape.
"Humanity’s Last Exam" exposes AI’s knowledge gaps
Scale AI and the Center for AI Safety have introduced "Humanity’s Last Exam" (HLE)—a new benchmark designed to push large language models (LLMs) to their limits. Featuring 3,000 expert-crafted questions across 100+ subjects, including math, humanities, and sciences, the test integrates multimodal challenges like images and diagrams.
Early results show top AI models like GPT-4, Claude 3.5, and DeepSeek scoring below 10%, with significant overconfidence in incorrect answers. With a $500K prize pool to encourage high-quality submissions, HLE aims to provide a more rigorous evaluation of AI’s true capabilities.
Why does it matter?
As AI systems surpass traditional benchmarks, HLE offers a tougher, more comprehensive test—helping measure progress better while highlighting current limitations.
OpenAI launches its first AI agent, "Operator"
OpenAI has introduced Operator, an AI agent that independently navigates web browsers to automate everyday tasks like filling forms, booking reservations, and ordering groceries. Built on the Computer-Using Agent (CUA) model, it leverages GPT-4o’s vision and reasoning to interact with websites via screenshots—without requiring special integrations. Here are other key details:
Scores 87% success on real-world sites like Amazon but struggles with complex workflows like combining PDFs from emails (38.1% on OSWorld benchmark).
Requires user approval for sensitive actions like payments.
Currently available as a research preview for U.S. Pro users at $200/month, with a broader rollout planned.
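The screenshot-driven control loop described above can be sketched in a few lines. This is a hypothetical stub, not OpenAI's CUA implementation: `choose_action`, `take_screenshot`, and `perform` stand in for the model call and browser automation a real agent would need.

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str    # e.g. "click", "type", or "done"
    target: str  # element description or text to type

def choose_action(screenshot: bytes, goal: str) -> Action:
    # Placeholder for the vision-language model call: a real CUA-style
    # agent would send the screenshot plus the goal to the model and
    # parse its reply into a concrete UI action. Stubbed out here.
    return Action(kind="done", target="")

def run_agent(goal: str, take_screenshot, perform, max_steps=20):
    """Screenshot -> decide -> act loop, as a CUA-style agent might run it."""
    for _ in range(max_steps):
        action = choose_action(take_screenshot(), goal)
        if action.kind == "done":
            return True   # model judges the goal reached
        perform(action)   # execute the click/keystroke in the browser
    return False          # step budget exhausted without finishing
```

The key design point is that the agent perceives only pixels, so it needs no site-specific integration—which is also why sensitive steps like payments are gated behind user approval.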
Why does it matter?
Operator marks OpenAI’s first major step into autonomous AI assistants, potentially reshaping human-AI interaction. While still in early stages, it signals the beginning of a new era for AI agents in real-world applications.
President Trump unveils $500B "Stargate" project
OpenAI, SoftBank, and Oracle have launched The Stargate Project, a $500 billion AI infrastructure initiative aimed at securing U.S. leadership in AI. The project, chaired by SoftBank’s Masayoshi Son, will start with a $100B investment to build large-scale data centers across the U.S., beginning in Texas. Key partners include Microsoft, NVIDIA, and Arm.
The announcement has sparked controversy, with Elon Musk questioning SoftBank’s financial backing, while Microsoft and Sam Altman defended the project’s stability.
Why does it matter?
With AI infrastructure at the center of geopolitical and economic competition, Stargate, the largest AI infrastructure investment in history, could reshape the AI race, but concerns over funding and regulatory oversight remain.
New AI lab challenges traditional AGI approaches
François Chollet and Zapier co-founder Mike Knoop have founded Ndea, a new AI lab aiming to achieve AGI through an alternative research approach. Chollet is a former Google researcher and the creator of the popular Keras AI framework and the ARC-AGI benchmark.
Unlike large-scale deep learning, Ndea combines deep learning with program synthesis to build AI that learns and adapts with human-like efficiency. It aims to accelerate scientific discovery across various domains, including drug research.
Why does it matter?
François Chollet is a highly respected AI researcher, and his move signals diverse new paths in AGI research, alongside efforts from leaders like Ilya Sutskever. A breakthrough may well come from outside giants like OpenAI or Meta.
Enjoying the latest AI updates?
Refer your pals to subscribe to our newsletter and get exclusive access to 400+ game-changing AI tools.
When you use the referral link above or the “Share” button on any post, you'll get the credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.
Knowledge Nugget: When AI Promises Speed but Delivers Debugging Hell
Savage’s attempt to fast-track development using AI turned into a cautionary tale about the limits of AI-assisted coding. While building Codescribble (a collaborative text editor), Savage used Claude 3.5 Sonnet to handle most of the coding, completing initial development in just 3-4 hours. However, the seemingly quick win turned sour during deployment, with issues ranging from hardcoded localhost references to problematic Docker configurations.

The experience revealed that AI can accelerate development but can't replace fundamental technical knowledge. Savage found himself in "debugging hell" trying to fix Docker-related issues, ultimately learning that understanding the underlying technology was essential to success.
Why does it matter?
This story emphasizes the need to balance AI's capabilities with maintaining technical expertise. It reminds us that AI tools are most effective only when used by developers who understand the fundamentals of what they're building rather than treating them as magical solution generators.
What Else Is Happening❗
📜The U.S. Copyright Office issues a new report ruling that AI-generated content isn’t copyrightable on its own but affirmed protections for human creators using AI as a tool.
🤖Perplexity AI launched Perplexity Assistant, a free agent-like tool for Android with multimodal and voice capabilities that can control apps and handle complex tasks.
🛡️Cisco announced AI Defense, a solution to protect AI systems from unauthorized tampering and data leaks with network-level safeguards and automated safety checks.
📊A new survey reveals that the share of U.S. teens who use ChatGPT for their schoolwork has doubled to 26%, while 73% haven’t used it this way.
💼Microsoft launched Microsoft 365 Copilot Chat, a rebranded entry-level version of its free AI assistant with pay-per-use agent features for business users.
🏆Google’s Imagen 3.0 debuted at No. 1 in the LM Text-to-Image Arena with a remarkable lead.
🚀Alibaba's Qwen team introduced Qwen2.5-VL, a visual AI model series that also functions as virtual agents and outperforms GPT-4o and Claude 3.5 Sonnet.
🏛️OpenAI introduced ChatGPT Gov, a version tailored to provide U.S. government agencies with an additional way to access OpenAI’s frontier models.
💡xAI's Grok-3 model went briefly live for some users ahead of its anticipated release next week, showcasing improved reasoning capabilities.
📱Apple's new iOS 18.3 system update turns on Apple Intelligence by default on devices, though AI summaries remain disabled.
🔓Hugging Face researchers launched Open-R1, a project aimed at building and open-sourcing a replica of DeepSeek's R1 model with all its components and training data.
New to the newsletter?
The AI Edge keeps engineering leaders & AI enthusiasts like you on the cutting edge of AI. From machine learning to ChatGPT to generative AI and large language models, we break down the latest AI developments and how you can apply them in your work.
Thanks for reading, and see you next week! 😊