Falcon topples LLaMA: Top open-source LM
Plus: Google’s new methodology for training LLMs, and OpenAI improves mathematical reasoning
Hello, Engineering Leaders and AI Enthusiasts,
Welcome to the 32nd edition of The AI Edge newsletter. In today’s edition, we bring you Falcon 40B, UAE’s leading open-source AI model, which is now free to use. Thank you to everyone who is reading this. 😊
In today’s edition:
🎉 Falcon topples LLaMA: Top open-source LM
💻 Google trains LM for software development
🔢 OpenAI improving mathematical reasoning
🎯 DPO to make RLHF obsolete?
🔮 OpenAI’s future plans revealed by Sam Altman
Let’s go!
Falcon topples LLaMA: Top open-source LM
Falcon 40B, UAE’s leading large-scale open-source AI model from the Technology Innovation Institute (TII), is now royalty-free for commercial and research use. It was previously released under a license requiring a 10% royalty on commercial use.
The model is now released under the Apache 2.0 software license, which grants end-users rights to any patents covering the software in question. TII has also made the model’s weights available so researchers and developers can use it to bring their innovative ideas to life.
Ranked #1 globally on Hugging Face’s Open LLM leaderboard, Falcon 40B outperforms competitors like Meta’s LLaMA, Stability AI’s StableLM, and RedPajama from Together.
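For those who want to try it: with the weights now openly licensed, loading Falcon looks like any other Hugging Face checkpoint. Below is a minimal sketch, assuming the publicly listed tiiuae/falcon-40b checkpoint and the standard transformers text-generation API (note that the 40B weights need substantial GPU memory, roughly 80 GB in bfloat16):

```python
# Minimal sketch: load Falcon 40B from the Hugging Face Hub and generate text.
# Assumes the "tiiuae/falcon-40b" checkpoint and enough GPU memory to hold
# the weights (multi-GPU in practice; device_map="auto" shards them for you).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-40b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves the memory footprint vs. float32
    device_map="auto",           # shard layers across available GPUs
    trust_remote_code=True,      # Falcon shipped with custom modeling code
)

inputs = tokenizer("Open-source LLMs matter because", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```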
Why does this matter?
Royalty-free deployments of Falcon 40B could empower public and private sector entities with efficiencies such as faster project starts, faster iterations, more flexible software development processes, robust community-driven support, and easier license management.
Google trains LM for software development
Google Research has developed DIDACT (Dynamic Integrated Developer ACTivity), a methodology for training large ML models for software development.
What sets it apart is that it uses the entire process of software development (editing code, fixing builds, doing end-to-end code review, and so on) as the source of training data for the model, not just the polished end state of that process: the finished code.
DIDACT turns Google's software development process into training demonstrations for ML developer assistants, and uses those demonstrations to train models that construct code step by step, interacting with tools and code reviewers along the way.
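Google hasn’t published DIDACT’s exact data format, but conceptually each demonstration pairs a snapshot of developer state with an intent and the action the engineer actually took next. Here is a purely hypothetical sketch of what one such training record might look like; the field names and serialization below are our illustration, not Google’s:

```python
# Purely illustrative: a hypothetical DIDACT-style training record.
# The field names are our invention; Google has not published the format.
# The key idea: the model sees intermediate developer STATE plus an INTENT,
# and is trained to predict the ACTION the engineer took next.
from dataclasses import dataclass

@dataclass
class DevDemonstration:
    state: str   # snapshot of the code and surrounding context (e.g. a failing build)
    intent: str  # what the developer is trying to accomplish right now
    action: str  # the edit the developer actually made

example = DevDemonstration(
    state="def add(a, b):\n    return a - b  # failing test: add(2, 2) == 4",
    intent="fix the failing build",
    action="replace 'a - b' with 'a + b'",
)

# Serialized into a prompt/target pair for ordinary next-token training:
prompt = f"STATE:\n{example.state}\nINTENT: {example.intent}\nACTION:"
target = f" {example.action}"
```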
Why does this matter?
DIDACT is perhaps the first step toward a general-purpose developer-assistance agent that can help across the entire software development process. The approach also complements the great strides LLMs at Google and elsewhere have made toward technologies that ease toil, improve productivity, and enhance the quality of software engineers’ work.
OpenAI’s latest idea can help models do math with 78% accuracy
Even SoTA models today are prone to hallucinations, which can be particularly problematic in domains that require multi-step reasoning. To train more reliable models, OpenAI rewarded each correct step of reasoning (“process supervision”) instead of simply rewarding the correct final answer (“outcome supervision”).
Process supervision was found to significantly outperform outcome supervision for training models to solve problems from the challenging MATH dataset. The model in the experiment solved 78% of problems from a representative subset of the MATH test set.
Process supervision also has an important alignment benefit: it directly trains the model to produce a chain of thought endorsed by humans.
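Concretely, the trained process reward model (PRM) scores each individual step of a sampled solution, and whole solutions can be ranked by combining those step scores; at test time you sample many candidate solutions and keep the one the PRM trusts most (best-of-N). A minimal sketch of that selection logic, with score_step standing in as a placeholder for the PRM:

```python
# Sketch of best-of-N answer selection with a process reward model (PRM).
# `score_step` is a placeholder for the trained PRM: it returns the
# estimated probability that a single reasoning step is correct.
import math
from typing import Callable, List

def solution_score(steps: List[str], score_step: Callable[[str], float]) -> float:
    """Aggregate per-step correctness into one solution-level score.

    A multi-step solution is only as good as its weakest steps, so the
    product of step probabilities is a natural aggregate.
    """
    return math.prod(score_step(step) for step in steps)

def best_of_n(candidates: List[List[str]], score_step: Callable[[str], float]) -> List[str]:
    """Return the candidate solution the PRM scores highest."""
    return max(candidates, key=lambda steps: solution_score(steps, score_step))
```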
Why does this matter?
It is unknown how broadly these results will generalize beyond the domain of math, and it will be important for future work to explore the impact of process supervision in other domains. If these results do generalize, we may find that process supervision gives us the best of both worlds: a method that is both more performant and more aligned than outcome supervision.
DPO to make RLHF obsolete?
Direct Preference Optimization (DPO) aims to improve the control of large language models. It proposes an alternative to reinforcement learning from human feedback (RLHF), which is used to finetune LLMs like ChatGPT on human instructions. According to the researchers’ benchmarks, DPO is more efficient, and its responses are often preferred over RLHF/PPO in terms of quality.
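At its core, DPO replaces RLHF’s reward-model-plus-PPO pipeline with a single classification-style loss over preference pairs: it raises the policy’s log-probability of the preferred response relative to a frozen reference model and lowers it for the dispreferred one. A minimal PyTorch sketch of the loss, assuming the per-response log-probabilities have already been computed:

```python
# Minimal sketch of the DPO objective on a batch of preference pairs.
# Inputs are summed log-probabilities of entire responses:
#   policy_* -> log pi_theta(y|x) under the policy being trained
#   ref_*    -> log pi_ref(y|x) under the frozen reference model
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen: torch.Tensor,
             policy_rejected: torch.Tensor,
             ref_chosen: torch.Tensor,
             ref_rejected: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Implicit rewards are log-ratios of policy to reference.
    chosen_logratio = policy_chosen - ref_chosen
    rejected_logratio = policy_rejected - ref_rejected
    # Binary classification: the chosen response should win by a margin.
    logits = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(logits).mean()
```

Because the loss is effectively a logistic regression over log-ratios, training needs no sampling loop and no separately trained reward model, which is where the efficiency gains come from.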
Why does this matter?
DPO appears to be a more straightforward and effective way to make language models behave according to our preferences: it is efficient, stable, and requires fewer computational resources. It doesn’t need a separate reward model or extensive fine-tuning like existing methods. The experiments show that DPO can align the model with human preferences and improve text quality across different tasks.
OpenAI: 1M-token context window coming very soon & much more from Sam Altman
In an exclusive meeting last week, renowned entrepreneur and OpenAI CEO Sam Altman spoke with 20 developers about OpenAI’s APIs and forthcoming product plans. Altman’s candor during the session allowed the group to cover practical developer issues as well as bigger-picture questions about OpenAI’s mission and the societal impact of AI.
Here are the key insights from the exchange:
OpenAI is facing severe GPU limitations at present, and the biggest customer pain point is the reliability and speed of the API.
A 1M-token context window is coming very soon (this year).
In 2023, OpenAI’s near-term roadmap will focus on developing a cheaper and faster GPT-4, extending the finetuning API to the latest models, and introducing a stateful API that remembers conversation history. Multimodality is planned for 2024, pending the availability of more GPUs.
Sam indicated that ChatGPT plugins are probably not coming to the API anytime soon.
OpenAI avoids competing with customers, except in ChatGPT.
OpenAI is considering open-sourcing GPT-3 and emphasizes the importance of open-source contributions. Hosting and serving LLMs remains a challenge.
Scaling laws for model performance continue to hold, and making models larger will continue to yield performance gains.
Why does this matter?
The insights shared in this discussion help individuals and organizations understand AI’s challenges, opportunities, and future directions. Chances are OpenAI would move even faster if it could get its hands on more GPUs.
What Else Is Happening
🕺 AI can make statues dance! (Link)
🔍 Amazon trains AI to weed out damaged goods (Link)
🚀 Snapchat launches new generative AI feature, ‘My AI Snaps’ (Link)
🛒 Instacart launches in-app AI search tool powered by ChatGPT (Link)
💸 China's Baidu launches $145 million venture capital AI fund (Link)
🛡️ Leveraging AI and ML to protect and validate relevant patient data with Dell (Link)
Trending Tools
Recordme AI: Automate financial management, chatbot interactions, and get customized reports. Boost efficiency and ROI.
Teleprompter: Enhance videos with AI Video Captions! Effortlessly generate accurate captions for accessibility and engagement.
Shortkut: Chrome extension with AI-powered shortcuts for internet input boxes. Access generative AI at your fingertips.
Exact Science: AI rank tracking, content scoring, and history for enhanced SEO. Centralize brand management, and integrate GSC.
STRATxAI: No-code AI investment platform to build strategies. Boost investing skills with the SAM chatbot.
Deeto AI: Streamline managing client references. Invite, create, manage, and grow with AI-driven Deeto.
Planit Earth: Personalized itineraries with generative AI. Tailored recommendations based on destination, trip length, and budget.
Predibase: Low-code AI platform for building and optimizing models. Simplifies model development for engineers and data scientists.
That's all for now!
If you are new to ‘The AI Edge’ newsletter, subscribe to receive the ‘Ultimate AI tools and ChatGPT Prompt guide’, specifically designed for Engineering Leaders and AI enthusiasts.
Thanks for reading, and see you tomorrow.