iOS 18 to Have AI Features with On-Device Processing
Plus: Many-shot in-context learning is a breakthrough in improving LLM performance, Groq shatters AI inference speed record with 800 tokens/second on LLaMA 3
Hello Engineering Leaders and AI Enthusiasts!
Welcome to the 258th edition of The AI Edge newsletter. This edition brings you details on Apple’s plans to introduce AI features with complete on-device processing.
And a huge shoutout to our incredible readers. We appreciate you😊
In today’s edition:
🍎 iOS 18 to have AI features with on-device processing
🧠 Many-shot ICL is a breakthrough in improving LLM performance
⚡ Groq shatters AI inference speed record with 800 tokens/second on LLaMA 3
💡 Knowledge Nugget: Is Humane's AI Pin Today's General Magic?
Let’s go!
iOS 18 to have AI features with complete on-device processing
Apple is set to make significant strides in artificial intelligence with the upcoming release of iOS 18. According to AppleInsider’s recent report, the tech giant is focusing on privacy-centric AI features that will function entirely on-device, eliminating the need for cloud-based processing or an internet connection. This approach addresses concerns surrounding AI tools that rely on server-side processing, which have been known to generate inaccurate information and compromise user privacy.
The company is reportedly developing an in-house LLM called "Ajax," which will power AI features in iOS 18. Users can expect improvements to Messages, Safari, Spotlight Search, and Siri, with basic text analysis and response generation available offline. We'll learn more about Apple's AI plans at the Worldwide Developers Conference (WWDC) starting June 10.
Why does this matter?
Apple’s commitment to user data privacy is commendable, but eliminating cloud-based processing and internet connectivity may impede the implementation of more advanced features. Nevertheless, it presents an opportunity for Apple to differentiate itself from competitors by offering users a choice between privacy-focused on-device processing and more powerful cloud-based features.
Many-shot in-context learning is a breakthrough in improving LLM performance
A recent research paper has introduced a groundbreaking technique that enables LLMs to significantly improve performance by learning from hundreds or thousands of examples provided in context. This approach, called many-shot in-context learning (ICL), has shown superior results compared to the traditional few-shot learning method across a wide range of generative and discriminative tasks.
To address the limitation of relying on human-generated examples for many-shot ICL, the researchers explored two novel settings: Reinforced ICL, which uses model-generated chain-of-thought rationales instead of human examples, and Unsupervised ICL, which removes rationales from the prompt altogether and presents the model with only domain-specific questions.
Both approaches have proven highly effective in the many-shot regime, particularly for complex reasoning tasks. Furthermore, the study reveals that many-shot learning can override pretraining biases and learn high-dimensional functions with numerical inputs, unlike few-shot learning, showcasing its potential to revolutionize AI applications.
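To make the three settings concrete, here is a minimal sketch of how such prompts could be assembled. The `build_icl_prompt` helper, the example format, and the toy data are all illustrative assumptions, not code from the paper; only the prompt structure (examples concatenated ahead of the query, with rationales optionally included or stripped) reflects the technique described above.

```python
# Sketch of how a many-shot ICL prompt differs from few-shot: the same
# template, just with hundreds of in-context examples instead of a handful.
# `build_icl_prompt` and the example dicts are hypothetical stand-ins.

def build_icl_prompt(examples, query, include_rationales=True):
    """Concatenate in-context examples ahead of the actual query.

    - Many-shot ICL: pass hundreds or thousands of `examples`.
    - Reinforced ICL: `examples` carry model-generated rationales
      (filtered for correct final answers) instead of human-written ones.
    - Unsupervised ICL: set include_rationales=False and pass examples
      with answers omitted, i.e. domain-specific questions only.
    """
    parts = []
    for ex in examples:
        parts.append(f"Q: {ex['question']}")
        if include_rationales and "rationale" in ex:
            parts.append(f"Reasoning: {ex['rationale']}")
        if "answer" in ex:
            parts.append(f"A: {ex['answer']}")
    parts.append(f"Q: {query}")
    parts.append("A:")
    return "\n".join(parts)

examples = [
    {"question": "2 + 2?", "rationale": "Add the digits.", "answer": "4"},
    {"question": "3 * 5?", "rationale": "Repeated addition.", "answer": "15"},
]
# Repeating the toy examples stands in for a genuinely large example pool.
prompt = build_icl_prompt(examples * 100, "7 * 6?")
```

In practice the limiting factor is the model's context window: many-shot ICL only became viable once models could accept hundreds of thousands of tokens of context.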
Why does this matter?
Many-shot ICL allows for quick adaptation to new tasks and domains without the need for extensive fine-tuning or retraining. However, the success of many-shot ICL heavily depends on the quality and relevance of the examples provided. Moreover, as shown by Anthropic’s jailbreaking experiment, some users could use this technique to intentionally provide carefully crafted examples designed to exploit vulnerabilities or introduce biases, leading to unintended and dangerous consequences.
Groq shatters AI inference speed record with 800 tokens/second on LLaMA 3
AI chip startup Groq recently confirmed that its novel processor architecture is serving Meta's newly released LLaMA 3 large language model at over 800 tokens per second. This translates to generating about 500 words of text per second, nearly an order of magnitude faster than the typical speeds of large models on mainstream GPUs. Early testing by users seems to validate the claim.
Groq's Tensor Streaming Processor is designed from the ground up to accelerate AI inference workloads, eschewing the caches and complex control logic of general-purpose CPUs and GPUs. The company asserts this "clean sheet" approach dramatically reduces the latency, power consumption, and cost of running massive neural networks.
Why does this matter?
If the LLaMA 3 result holds up, it could shake up the competitive landscape for AI inference, challenging Nvidia's dominance of GPUs and increasing the demand for purpose-built AI hardware for faster and more cost-effective inference solutions. Also, Groq’s capabilities could revolutionize software solutions that depend on real-time AI, such as virtual assistants, chatbots, and interactive customer services.
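A quick back-of-envelope check shows what 800 tokens/second means for a user-facing application. The words-per-token ratio (roughly 0.6–0.75 for English text) and the baseline GPU speed used below are rough assumptions for illustration, not figures from Groq or Meta:

```python
# Back-of-envelope latency math for the Groq throughput claim.
# words_per_token and the GPU baseline are assumptions, not measured values.

def words_per_second(tokens_per_second, words_per_token=0.625):
    """Convert decode throughput from tokens/s to approximate words/s."""
    return tokens_per_second * words_per_token

def time_for_response(n_words, tokens_per_second, words_per_token=0.625):
    """Seconds to generate an n_words response at a given decode speed."""
    return n_words / words_per_second(tokens_per_second, words_per_token)

groq_tps = 800   # claimed LLaMA 3 decode speed on Groq hardware
gpu_tps = 80     # assumed typical per-stream speed on mainstream GPUs

print(words_per_second(groq_tps))        # 500.0 words/s
print(time_for_response(300, groq_tps))  # 0.6 s for a 300-word reply
print(time_for_response(300, gpu_tps))   # 6.0 s at the assumed GPU baseline
```

A sub-second full response is what makes real-time use cases like voice assistants feel conversational, whereas a multi-second wait does not.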
Enjoying the daily updates?
Refer your pals to subscribe to our daily newsletter and get exclusive access to 400+ game-changing AI tools.
When you use the referral link above or the “Share” button on any post, you'll get the credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.
Knowledge Nugget: Is Humane's AI Pin Today's General Magic?
The Humane AI Pin, a screenless, wearable AI device that clips onto clothing, has recently been released to mixed reviews, according to tech blogger Greg. While the device boasts impressive engineering, including a camera, microphone, touchpad, and laser projector, initial reviews have been less than stellar. Critics have pointed out shortcomings such as the high price tag, slow response times, and lack of integration with existing mobile apps.
Photo credit: The Verge
Despite these challenges, Greg believes that Humane's attempt to introduce a new type of AI-powered wearable device is noteworthy in a market saturated with smartphones. He draws parallels to General Magic, a company founded by former Apple employees in the 1990s, which created innovative technology that was ahead of its time but ultimately failed due to market readiness and infrastructure limitations.
As the AI Pin faces similar hurdles, it remains to be seen how Humane will address the feedback and evolve the product to meet user expectations.
Why does this matter?
By decoupling AI from smartphones and giving it a dedicated wearable form factor, Humane is betting that users will prefer a hands-free interaction with AI. If successful, this could pressure other AI players like Apple, Google, and Amazon to develop their own AI-powered wearables, leading to a new wave of innovation in the space. However, the AI Pin's high price and limited functionality compared to smartphones could hinder its adoption, at least in the short term.
What Else Is Happening❗
🤖 Israel-based startup enters AI humanoid race with Menteebot
Israel-based startup Mentee Robotics has unveiled Menteebot, an AI-driven humanoid robot prototype for home and warehouse use. It employs transformer-based large language models, NeRF-based algorithms, and simulator-to-reality machine learning to understand commands, create 3D maps, and perform tasks. The finalized Menteebot is anticipated to launch in Q1 2025. (Link)
🩺 Hugging Face introduces benchmark for evaluating gen AI in healthcare
The benchmark combines existing test sets to assess medical knowledge and reasoning across various fields. It’s a starting point for evaluating healthcare-focused AI models, but experts caution against relying solely on the benchmark and emphasize the need for thorough real-world testing. (Link)
🔄 Google announces major restructuring to accelerate AI development
The changes involve consolidating AI model building at Google Research and DeepMind, focusing Google Research on foundational breakthroughs and responsible AI practices, and introducing a new "Platforms & Devices" product area. (Link)
🎧 Nothing's new earbuds offer ChatGPT integration
Nothing Ear and Nothing Ear (a) allow users to ask questions by pinching the headphones' stem, provided the ChatGPT app is installed on a connected Nothing handset. The earbuds offer improved sound quality, better noise-canceling, and longer battery life than their predecessors. (Link)
🚪 Japanese researchers develop AI tool to predict employee turnover
The tool analyzes employee data, such as attendance records and personal information, and creates a turnover model for each company. By predicting which new recruits are likely to quit, the AI tool enables managers to offer targeted support to those employees and potentially reduce turnover rates. (Link)
New to the newsletter?
The AI Edge keeps engineering leaders & AI enthusiasts like you on the cutting edge of AI. From ML to ChatGPT to generative AI and LLMs, we break down the latest AI developments and how you can apply them in your work.
Thanks for reading, and see you tomorrow. 😊
I don’t think Apple will keep the AI tools *only* on device for long. It’s cool that they’re starting with all on-device processing, which should enable some novel use cases on the iPhone later this year. Where appropriate from a security and processing perspective, though, I think they’ll start adding cloud-computed functionality, enabled by their acquisition of many AI startups over the last few years as well as by leveraging their partnership with Google (a Gemini-powered Siri is definitely on the way).