GPT-4 API & Code Interpreter Available Now!
Plus: Salesforce’s CodeGen2.5 is small but mighty. InternLM: A model tailored for practical scenarios.
Hello Engineering Leaders and AI Enthusiasts!
Welcome to the 58th edition of The AI Edge newsletter. This edition brings you OpenAI’s enabling of general availability for GPT-4 and Code Interpreter.
A huge shoutout to our amazing readers. We appreciate you! 😊
In today’s edition:
🌍 OpenAI makes GPT-4 and Code Interpreter available
🧠 Salesforce’s CodeGen2.5, a small but mighty code LLM
🤖 InternLM: A model tailored for practical scenarios
📚 Knowledge Nugget: Models generating training data: huge win or fake win?
Let’s go!
OpenAI makes GPT-4 API and Code Interpreter available
The GPT-4 API is now available to all paying OpenAI API customers. The GPT-3.5 Turbo, DALL·E, and Whisper APIs are also now generally available, and OpenAI has announced a deprecation plan for some older models, which will begin retiring at the start of 2024.
(Source)
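For readers who want to try the newly available API, here is a minimal sketch of a Chat Completions request using only the Python standard library. The endpoint and payload shape follow OpenAI's documented REST API, but the helper names and the `OPENAI_API_KEY` environment-variable convention are illustrative assumptions, not part of the announcement.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_chat_request(prompt: str, model: str = "gpt-4") -> dict:
    """Assemble the JSON body for a Chat Completions call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }


def ask_gpt4(prompt: str) -> str:
    """Send the request; requires a paid API key in OPENAI_API_KEY."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# ask_gpt4("Explain the GPT-4 general availability announcement.")
# would return the model's reply (a live network call, not run here).
```

The official `openai` Python package wraps the same endpoint; the raw-`urllib` version above is just to make the request structure visible.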
Moreover, OpenAI’s Code Interpreter will be available to all ChatGPT Plus users over the next week. It lets ChatGPT run code, optionally with access to files you've uploaded. You can also ask ChatGPT to analyze data, create charts, edit files, perform math, etc.
(Source)
Why does this matter?
Making the GPT-4 API broadly accessible will democratize AI, accelerate innovation, and lead to a wider range of AI-driven solutions across domains and industries. Combined with Code Interpreter, it could simplify data analysis, let developers focus more on application-specific tasks, and increase efficiency in AI development.
Salesforce’s CodeGen2.5, a small but mighty code LLM
Salesforce’s CodeGen family of models allows users to “translate” natural language, such as English, into programming languages, such as Python. Now the family has a new member: CodeGen2.5, a small but mighty LLM for code. Here’s a tl;dr:
Its smaller size means faster sampling, resulting in a 2x speed improvement over CodeGen2. The small footprint also makes local deployments practical, enabling personalized coding assistants.
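As a rough sketch of the natural-language-to-code workflow, the snippet below samples from a CodeGen2.5 checkpoint with Hugging Face `transformers`. The checkpoint name (`Salesforce/codegen25-7b-mono`), the prompt format, and the generation settings are my assumptions for illustration; check the model card before running.

```python
def build_prompt(description: str, signature: str) -> str:
    """Turn an English description into a code-completion prompt:
    a function signature plus a docstring for the model to complete."""
    return f'{signature}\n    """{description}"""\n'


def generate_code(description: str, signature: str) -> str:
    """Sample a completion locally (downloads a ~7B checkpoint)."""
    # Heavy imports live here so the prompt helper stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "Salesforce/codegen25-7b-mono"  # assumed checkpoint name
    tok = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(name, trust_remote_code=True)
    ids = tok(build_prompt(description, signature), return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=128)
    return tok.decode(out[0], skip_special_tokens=True)

# generate_code("Return the nth Fibonacci number.", "def fib(n):")
# would return a completed Python function (needs enough RAM for the model).
```

This local-deployment pattern is exactly what the small model size makes practical: no hosted endpoint is needed to get a personal coding assistant.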
Why does this matter?
CodeGen2.5 shows that small models can perform surprisingly well when trained well. While the dominant trend has been to scale LLMs to ever-larger sizes, this challenges the notion that bigger necessarily means better, and puts renewed emphasis on other factors, such as computational efficiency and the quality of training data.
InternLM: A model tailored for practical scenarios
InternLM has open-sourced a 7B-parameter base model and a chat model tailored for practical scenarios. The model:
Leverages trillions of high-quality tokens for training to establish a powerful knowledge base
Supports an 8k context window length, enabling longer input sequences and stronger reasoning capabilities
Provides a versatile toolset for users to flexibly build their own workflows
It is a 7B version of a 104B model that achieves SoTA performance in multiple areas, including knowledge understanding, reading comprehension, mathematics, and coding. InternLM-7B outperforms LLaMA, Alpaca, and Vicuna on comprehensive benchmarks, including MMLU, HumanEval, MATH, and more.
Why does this matter?
This small-scale open-source version seems positioned to compete with offerings from companies such as OpenAI in the Chinese market. In other news, China’s Alibaba and Huawei have showcased new AI products as they jostle for position in the global AI race. Apparently, Chinese tech organizations are aggressively developing AI products after ChatGPT ignited a generative AI boom.
Knowledge Nugget: Models generating training data: huge win or fake win?
We’ve seen a lot of papers claiming you can use one language model to generate useful training data for another language model. But is it a huge or a fake win for us?
In this intriguing article, the author attempts to answer this. The article explores the tension between the empirical gains from generated training data and the data processing inequality, and presents various examples and studies demonstrating both the benefits and limitations of training-data generation. It proposes that the key to its effectiveness lies not in the model generating the data but in the filtering process. And much more.
Why does this matter?
The article offers a thought-provoking perspective on training data generation, filtering techniques, and the relationship between models and data. It can expand the understanding of AI practitioners and stimulate critical thinking in the realm of language model training and data generation.
What Else Is Happening❗
🧑🎨Ameca draws with the help of Stable Diffusion and also explains how! (Link)
🔍AWS Docs GPT: AI-powered search and chat for AWS documentation (Link)
🎨Alibaba unveils an image generator to take on Midjourney and DALL-E (Link)
💰DigitalOcean acquires cloud computing and AI startup Paperspace for $111M (Link)
📈AI-powered innovation could create over £400B in economic value for UK by 2030 (Link)
🌍A Stanford study finds AI agents that “self-reflect” perform better in changing environments (Link)
🛠️ Trending Tools
Photocode: AI-powered code analyzer and debugger for 20+ languages from code photos.
Amigo AI: Versatile AI chatbot assistant with useful features for efficiency and fun.
Traw: AI-powered audio and video summarization service for effortless content organization.
PRST AI: Enhance AI model performance with prompt management and tailored prompt generation.
Tutory: AI tutor adapting to each student's learning style, providing accessible education.
Spellmint: AI tool for streamlined agile workflows in planning, building, and analyzing tasks.
Songbot AI: Text-to-Vocals app generates epic music videos with smooth vocals and dope lyrics.
Notemonkey: AI-powered tool transforming disorganized thoughts and ideas into clear summaries.
That's all for now!
If you are new to ‘The AI Edge’ newsletter, subscribe to receive the ‘Ultimate AI tools and ChatGPT Prompt guide’ specifically designed for Engineering Leaders and AI enthusiasts.
Thanks for reading, and see you Monday. 😊