AI Weekly Rundown (November 25 to December 01)
Major AI announcements from Microsoft, Amazon, Google DeepMind, Pika, and more.
Hello Engineering Leaders and AI Enthusiasts!
Another eventful week in the AI realm. Lots of big news from huge enterprises, with major updates from Amazon.
In today’s edition:
😎 A new technique from researchers accelerate LLMs by 300x🌐
AI tool 'screenshot-to-code' generates entire code from screenshots
🤖 Microsoft Research explains why hallucination is necessary in LLMs!
🎁 Amazon is using AI to improve your holiday shopping
🧠 AI algorithms are powering the search for cells
🚀 AWS adds new languages and AI capabilities to Amazon Transcribe
💼 Amazon announces Q, an AI chatbot tailored for businesses✨
Amazon launches 2 new chips for training + running AI models
🎥 Pika officially reveals Pika 1.0, idea-to-video platform
🖼️ Amazon’s AI image generator, and other AWS re:Invent updates
💡 Perplexity introduces PPLX online LLMs
💎 DeepMind’s AI tool finds 2.2M new crystals to advance technology
🎭 Meta's new models make communication seamless for 100 languages
🚗 Researchers release Agent-driver, uses LLMs for autonomous driving
💳 Mastercard launches an AI service to help you find the perfect gift
Let’s go!
This new technique accelerates LLMs by 300x
Researchers at ETH Zurich have developed a new technique UltraFastBERT, a language model that uses only 0.3% of its neurons during inference while maintaining performance. It can accelerate language models by 300 times. And by introducing "fast feedforward" layers (FFF) that use conditional matrix multiplication (CMM) instead of dense matrix multiplications (DMM), the researchers were able to significantly reduce the computational load of neural networks.
They validated their technique with FastBERT, a modified version of Google's BERT model, and achieved impressive results on various language tasks. The researchers believe that incorporating fast feedforward networks into large language models like GPT-3 could lead to even greater acceleration.
Read the Paper here.
AI tool 'Screenshot-to-Code' generates entire code
GitHub user abi has created a tool called "screenshot-to-code" that allows users to convert a screenshot into clean HTML/Tailwind CSS code. The tool utilizes GPT-4 Vision to generate the code and DALL-E 3 to generate visually similar images. Users can also input a URL to clone a live website.
All you want to do is: Upload any screenshot of a website and watch AI build the entire code. It will improve the generated code by comparing it against the screenshot repeatedly.
Microsoft Research explains why Hallucination is necessary in LLMs!
Microsoft Research + 4 others have explored that there is a statistical reason behind these hallucinations, unrelated to the model architecture or data quality. For arbitrary facts that cannot be verified from the training data, hallucination is necessary for language models that satisfy a statistical calibration condition.
However, the analysis suggests that pretraining does not lead to hallucinations on facts that appear more than once in the training data or on systematic facts. Different architectures and learning algorithms may help mitigate these types of hallucinations.
Amazon is using AI to improve your holiday shopping
This holiday season, Amazon is using AI to power and enhance every part of the customer journey. Its new initiatives include:
Supply Chain Optimization Technology (SCOT): It helps forecast demand for more than 400 million products each day, using deep learning and massive datasets to decide which products to stock in which quantities at which Amazon facility.
AI-enabled robots: AI is also helping Amazon orchestrate the world’s largest fleet of mobile industrial robots. They help recognize, sort, inspect, package, and load millions of diverse goods.
A robot called “Robin” helps sort packages for fast delivery: It uses an AI-enhanced vision system to understand what objects are there– different-sized boxes, soft packages, and envelopes on top of each other.
AI helps predict the unpredictable on the road: Whether it's bad weather or traffic, or a truck with products might come to the station early.
Picking the best delivery routes: Route design and optimization is notoriously one of the most difficult problems for Amazon. It uses over 20 ML models that work in concert behind the scenes.
In addition, delivery teams are exploring the use of generative AI and LLMs to simplify decisions for drivers: by clarifying customer delivery notes, building outlines, road entry points, and much more.
(Source)
AI algorithms are powering the search for cells
Deep learning is driving the rapid evolution of algorithms that can automatically find and trace cells in a wide range of microscopy experiments. New models are reaching unprecedented accuracy heights.
A new paper by Nature details how AI-powered image analysis tools are changing the game for microscopy data. It highlights the evolution from early, labor-intensive methods to machine learning-based tools like CellProfiler, ilastik, and newer frameworks such as U-Net. These advancements enable more accurate and faster segmentation of cells, essential for various biological imaging experiments.
Cancer-cell nuclei (green boxes) picked out by software using deep learning.
AWS adds new languages and AI capabilities to Amazon Transcribe
As announced during AWS re:Invent, the cloud provider added new languages and a slew of new AI capabilities to Amazon Transcribe. The product will now offer generative AI-based transcription for 100 languages. AWS ensured that some languages were not over-represented in the training data to ensure that lesser-used languages could be as accurate as more frequently spoken ones.
It also offers automatic punctuation, custom vocabulary, automatic language identification, and custom vocabulary filters. It can recognize speech in audio and video formats and noisy environments.
Amazon announces Q, an AI chatbot tailored for businesses
Amazon has announced Q, an AI chatbot tailored for businesses that is designed to assist AWS customers. Q can answer questions, generate content, and act on users' behalf. It is trained on 17 years' worth of AWS knowledge and can provide potential solutions to user queries.
Q can also analyze data, generate reports, and troubleshoot network connectivity issues. The chatbot is customizable and can be integrated with various apps and software. According to analysts, Q is a significant announcement and aims to arm developers with AI to enhance their productivity.
Amazon launches 2 new chips for training + running AI models
Amazon announces 2 new chips for training and running AI models; here are they:
1) The Trainium2 chip is designed to deliver better performance and energy efficiency than its predecessor and a cluster of 100,000 Trainium chips can train a 300-billion parameter AI language model in weeks.
2) The Graviton4 chip: The fourth generation in Amazon's Graviton chip family, provides better compute performance, more cores, and increased memory bandwidth. These chips aim to address the shortage of GPUs in high demand for generative AI. The Trainium2 chip will be available next year, while the Graviton4 chip is currently in preview.
Pika officially reveals Pika 1.0, idea-to-video platform
Pika, a video-making platform, has announced its major product upgrade, Pika 1.0, which includes an AI model capable of generating and editing videos in various styles. The company aims to make video creation effortless and accessible to everyone.
Pika has already grown its user base to half a million users, generating millions of videos per week. Additionally, Pika has raised $55 million in funding, with investments from industry leaders and AI experts. The platform allows users to join the waitlist for Pika 1.0 on their website.
📢 Invite friends and get rewards 🤑🎁
Enjoying AI updates? Refer friends and get perks and special access to The AI Edge.
Get 400+ AI Tools and 500+ Prompts for 1 referral.
Get A Free Shoutout! for 3 referrals.
Get The Ultimate Gen AI Handbook for 5 referrals.
When you use the referral link above or the “Share” button on any post, you'll get credit for any new subscribers. Simply send the link in a text, email or share it on social media with friends.
Amazon’s AI image generator, and other announcements from AWS re:Invent (Nov 29)
Titan Image Generator: Titan isn’t a standalone app or website but a tool that developers can build on to make their own image generators powered by the model. To use it, developers will need access to Amazon Bedrock. It’s aimed squarely at an enterprise audience, rather than the more consumer-oriented focus of well-known existing image generators like OpenAI’s DALL-E. (Source)
Amazon SageMaker HyperPod: AWS introduced Amazon SageMaker HyperPod, which helps reduce time to train foundation models (FMs) by providing a purpose-built infrastructure for distributed training at scale. (Source)
Clean Rooms ML: An offshoot of AWS’ existing Clean Rooms product, the service removes the need for AWS customers to share proprietary data with their outside partners to build, train and deploy AI models. You can train a private lookalike model across your collective data. (Source)
Amazon Neptune Analytics: It combines the best of both worlds– graph and vector databases– which has been a debate of sorts in AI circles about which database is more important in finding truthful information in generative AI applications. (Source)
Perplexity introduces PPLX online LLMs
Perplexity AI shared two new PPLX models: pplx-7b-online and pplx-70b-online. The online models are focused on delivering helpful, up-to-date, and factual responses, and are publicly available via pplx-api, making it a first-of-its-kind API. They are also accessible via Perplexity Labs, our LLM playground.
The models are aimed at addressing two limitations of LLMs today– freshness and hallucinations. The PPLX models build on top of mistral-7b and llama2-70b base models.
DeepMind’s AI tool finds 2.2M new crystals to advance technology
AI tool GNoME finds 2.2 million new crystals (equivalent to nearly 800 years’ worth of knowledge), including 380,000 stable materials that could power future technologies.
Modern technologies, from computer chips and batteries to solar panels, rely on inorganic crystals. Each new stable crystal takes months of painstaking experimentation. Plus, if they are unstable, they can decompose and wouldn’t enable new technologies.
Google DeepMind introduced Graph Networks for Materials Exploration (GNoME), its new deep learning tool that dramatically increases the speed and efficiency of discovery by predicting the stability of new materials. It can do at an unprecedented scale.
A-Lab, a facility at Berkeley Lab, is also using AI to guide robots in making new materials.
Meta’s new AI makes communication seamless in 100 languages
Meta has developed a family of 4 AI research models called Seamless Communication, which aims to remove language barriers and enable more natural and authentic communication across languages. Here are they:
It is the first publicly available system that unlocks expressive cross-lingual communication in real-time and allows researchers to build on this work.
Try the SeamlessExpressive demo to listen how you sound in different languages.
Today, alongside their models, they are releasing metadata, data, and data alignment tools to assist the research community, including:
Metadata of an extension of SeamlessAlign corresponding to an additional 115,000 hours of speech and text alignments on top of the existing 470k hours.
Metadata of SeamlessAlignExpressive, an expressivity-focused version of the dataset above.
Tools to assist the research community in collecting more datasets for translation.
NVIDIA researchers have integrated human-like intelligence into ADS
In this paper, the team of NVIDIA, Stanford, and USC researchers have released 'Agent-driver,' which integrates human-like intelligence into the driving system. It utilizes LLMs as a cognitive agent to enhance decision-making, reasoning, and planning.
Agent-Driver system includes a versatile tool library, a cognitive memory, and a reasoning engine. The system is evaluated on the nuScenes benchmark and outperforms existing driving methods significantly. It also demonstrates superior interpretability and the ability to learn with few examples. The code for this approach will be made available.
Mastercard introduces Muse AI for tailored shopping
Mastercard has launched Shopping Muse, an AI-powered tool that helps consumers find the perfect gift. AI will provide personalized recommendations on a retailer's website based on the individual consumer's profile, intent, and affinity.
Shopping Muse translates consumer requests made via a chatbot into tailored product recommendations, including suggestions for coordinating products and accessories. It considers the shopper's browsing history and past purchases to estimate future buying intent better.
That's all for now!
If you are new to ‘The AI Edge’ newsletter. Subscribe to receive the ‘Ultimate AI tools and ChatGPT Prompt guide’ specifically designed for engineering leaders and AI enthusiasts.
Thanks for reading, and see you on Monday. 😊