AI Weekly Rundown (July 8 to July 14)

News from Google, OpenAI, Anthropic, Microsoft, Stable Diffusion and more.

Jul 15, 2023

Hello, Engineering Leaders and AI Enthusiasts,

Another eventful week in the AI realm. Lots of big news from huge enterprises.

In today’s edition:

✅ Google & Microsoft battle to lead healthcare AI
✅ The impact of poisoning LLM supply chains
✅ How language models use long contexts
✅ AI can now send Bitcoins!
✅ Google & Stanford researchers use LLMs to solve Robotics challenges
✅ RLTF improves LLMs for Code generation
✅ Anthropic’s new Claude 2 rivals ChatGPT & Google Bard
✅ gpt-prompt-engineer takes AI to heights
✅ Elon Musk launches xAI to rival OpenAI
✅ Google introduces AI-powered NotebookLM & Bard updates
✅ Objaverse-XL's 10M+ dataset set to revolutionize AI in 3D
✅ Meta plans to dethrone OpenAI and Google
✅ OpenAI enters partnership to make ChatGPT smarter
✅ Stable Doodle: Next chapter in AI art

Let’s go!

Google & Microsoft battle to lead healthcare AI

Reportedly, Google’s Med-PaLM 2 (an LLM for the medical domain) has been in testing at the Mayo Clinic research hospital. In April, Google announced limited access for select Google Cloud customers, who can explore use cases and share feedback as the company investigates safe, responsible, and meaningful ways to use it.

Meanwhile, Google’s rivals moved quickly to incorporate AI advances into patient interactions. Hospitals are beginning to test OpenAI’s GPT algorithms through Microsoft’s cloud service for several tasks. Google’s Med-PaLM 2 and OpenAI’s GPT-4 scored similarly on medical exam questions, according to independent research released by the companies.

Source

The impact of poisoning LLM supply chains

LLMs are gaining massive recognition worldwide. However, there is currently no way to determine which data and algorithms were used to train a given model. To showcase the impact of this gap, Mithril Security undertook an educational project, PoisonGPT, aimed at showing the dangers of poisoning LLM supply chains.

It shows how one can surgically modify an open-source model and upload it to Hugging Face to make it spread misinformation while remaining undetected by standard benchmarks.

Mithril Security is also working on AICert, a soon-to-launch solution for tracing models back to their training algorithms and datasets.

Source

How language models use long contexts

LLM vendors are fiercely competing to claim the title of having the biggest context window. Recently, Anthropic made headlines for expanding Claude’s context window to 100K tokens. But does a bigger context window always lead to better results?

New research offers insights into how models use long contexts, along with the limitations involved. It finds that:

Language models often struggle to use information in the middle of long input contexts
Their performance decreases as the input context grows longer
The performance is often highest when relevant information occurs at the beginning or end of the input context
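
One workaround that follows from these findings is to reorder retrieved passages so the most relevant ones sit at the edges of the prompt and the weakest ones land in the middle. The sketch below is an illustration under that assumption, not a method taken from the research itself; the function name and the idea that passages arrive pre-sorted by relevance are hypothetical.

```python
def reorder_for_long_context(docs_by_relevance):
    """Place the strongest passages at the start and end of the context.

    Assumes `docs_by_relevance` is already sorted, most relevant first.
    """
    front, back = [], []
    for index, doc in enumerate(docs_by_relevance):
        # Alternate passages between the front and the back of the prompt.
        if index % 2 == 0:
            front.append(doc)
        else:
            back.append(doc)
    # Reverse the back half so relevance rises again toward the end.
    return front + back[::-1]


ranked = ["doc1", "doc2", "doc3", "doc4", "doc5"]  # doc1 is most relevant
print(reorder_for_long_context(ranked))
```

With five passages, the most relevant one opens the prompt, the second most relevant one closes it, and the least relevant one ends up in the middle.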

Source

AI can now send Bitcoins!

Lightning Labs has introduced AI tools that allow AI applications to hold, send, and receive Bitcoin. The tools leverage the Lightning Network, a second-layer payment network for faster and cheaper Bitcoin transactions. By integrating high-volume Bitcoin micropayments with popular AI software libraries like LangChain, Lightning Labs addresses the lack of a native Internet-based payment mechanism for AI platforms.

Source

Google & Stanford researchers use LLMs to solve Robotics challenges

Recent research has found that pre-trained LLMs can complete complex token sequences, including those generated by probabilistic context-free grammars (PCFG) and ASCII art prompts. The study explores how these zero-shot capabilities can be applied to robotics problems, such as extrapolating sequences of numbers to complete simple motions and prompting reward-conditioned trajectories to discover and represent closed-loop policies.

Although deploying LLMs for real systems is currently challenging due to latency, context size limitations, and compute costs, the study suggests that using LLMs to drive low-level control could provide insight into how patterns among words could be transferred to actions.

Source

RLTF improves LLMs for Code generation

Researchers have proposed a novel online reinforcement learning framework called RLTF for refining LLMs for code generation. The framework uses multi-granularity unit test feedback to generate data in real time during training and guide the model toward producing high-quality code. The approach achieves SotA performance on the APPS and MBPP benchmarks for models of its scale.
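
To make the idea of unit test feedback concrete, here is a minimal sketch of turning test results into a scalar reward for a generated program. It is not the RLTF implementation: the function name, the -1.0 penalty, and the pass-fraction scoring are illustrative assumptions, and a real system would run untrusted code in a sandbox rather than with exec.

```python
def unit_test_reward(source, func_name, test_cases):
    """Score generated code by running it against (args, expected) pairs.

    Returns -1.0 if the code cannot be loaded (illustrative penalty),
    otherwise the fraction of test cases that pass.
    """
    if not test_cases:
        raise ValueError("at least one test case is required")
    namespace = {}
    try:
        # Compile and load the candidate; syntax errors are caught here.
        exec(compile(source, "<candidate>", "exec"), namespace)
    except Exception:
        return -1.0
    func = namespace.get(func_name)
    if not callable(func):
        return -1.0
    passed = 0
    for args, expected in test_cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a runtime error counts as a failed test
    return passed / len(test_cases)


tests = [((1, 2), 3), ((0, 0), 0)]
print(unit_test_reward("def add(a, b):\n    return a + b", "add", tests))
print(unit_test_reward("def add(a, b):\n    return a - b", "add", tests))
```

A reward like this could then be fed to a reinforcement learning update, so that programs passing more tests are reinforced more strongly.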

Source

Anthropic’s new Claude 2 rivals ChatGPT & Google Bard

Anthropic has released its Claude 2 model. It has improved coding abilities, with significantly higher scores on programming evaluations, shows significantly better math and reasoning than previous models, and can be accessed via API and a new public-facing beta website, claude.ai.

Key information:

Scored 76.5% on the multiple-choice section of the Bar exam
Scored above the 90th percentile on the GRE reading & writing exams
Scored 71.2% on the Codex HumanEval Python coding test
Claude 2 API offered at Claude 1.3 price for businesses
100K-token context window
Users in the US and UK can use the beta chat experience starting today

Source

gpt-prompt-engineer takes AI to heights

Introducing ‘gpt-prompt-engineer’, a powerful tool for prompt engineering. It’s an agent that creates optimal GPT prompts, using GPT-4 and GPT-3.5-Turbo to generate and rank prompts based on test cases.

Just describe the task, and an AI agent will:

Generate many prompts
Test them in a tournament
Respond with the best prompt

The tool employs an Elo rating system to determine the effectiveness of each prompt. A specialized version is available for classification tasks, providing scores for each prompt. Optional logging to Weights & Biases facilitates experiment tracking. gpt-prompt-engineer streamlines prompt engineering, enabling users to optimize prompts for maximum performance.
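
For readers unfamiliar with Elo ratings, the sketch below shows how a round-robin tournament between prompts could update ratings with the standard Elo formula. It is not the tool's actual code: the K-factor of 32, the starting rating of 1200, and the judge callback (which in practice would be an LLM comparing two outputs) are assumptions made for illustration.

```python
from itertools import combinations

K = 32  # assumed update step size, not taken from the tool


def expected_score(rating_a, rating_b):
    """Standard Elo probability that A beats B."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a, rating_b, score_a, k=K):
    """score_a is 1.0 if A wins, 0.0 if B wins, 0.5 for a draw."""
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1 - score_a) - (1 - exp_a))
    return new_a, new_b


def run_tournament(prompts, judge, start=1200):
    """Play every pair once; `judge(a, b)` is a hypothetical callback
    returning prompt a's score against prompt b."""
    ratings = {prompt: start for prompt in prompts}
    for a, b in combinations(prompts, 2):
        ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], judge(a, b))
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Toy judge for demonstration only: the longer prompt always wins.
def toy_judge(a, b):
    return 1.0 if len(a) > len(b) else 0.0


for prompt, rating in run_tournament(["a", "bbb", "cc"], toy_judge):
    print(prompt, round(rating, 1))
```

The prompt at the top of the sorted list is the one the tournament would return as the best candidate.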

Source

Elon Musk launches xAI to rival OpenAI

Elon Musk has launched his long-teased artificial intelligence startup, xAI. Its team comprises experts from the same tech giants (Google, Microsoft) that he aims to challenge in a bid to build an alternative to ChatGPT.

Musk also said that rather than explicitly programming morality into its AI, xAI will seek to create a "maximally curious" AI. In April, he had said that he would launch TruthGPT, a maximum truth-seeking AI that tries to understand the nature of the universe, to rival Google's Bard and Microsoft's Bing AI.

Source

Google introduces AI-powered NotebookLM & Bard updates

Google has started rolling out NotebookLM, an AI-first notebook designed to pair the power of language models with your existing content so you can gain critical insights faster. It can summarize facts, explain complex ideas, and brainstorm new connections, all based on the sources you select.

It will be immediately available to a small group of users in the U.S. as Google continues to refine it.

(Source)

Google has also finally launched Bard in the European Union (EU) and Brazil. It is now available in more than 40 languages. Moreover, Bard has new features enabling it to speak its answers, respond to prompts that include images, and more.

(Source)

Objaverse-XL's 10M+ dataset set to revolutionize AI in 3D

New research from Stability AI (and others) has introduced Objaverse-XL, a large-scale web-crawled open dataset of over 10 million 3D objects. With it, researchers have trained Zero123-XL, a foundation model for 3D, and observed strong 3D generalization abilities.

It shows significantly better zero-shot generalization to challenging and complex modalities, including photorealistic assets, cartoons, drawings, and sketches. Thus, the scale and diversity of assets in Objaverse-XL can significantly improve the performance of state-of-the-art 3D models.

Source

Meta plans to dethrone OpenAI and Google

Meta plans to release a commercial AI model to compete with OpenAI, Microsoft, and Google. The model will generate language, code, and images. It might be an updated version of Meta's LLaMA, which is currently only available under a research license.

Meta's CEO, Mark Zuckerberg, has expressed the company's intention to use the model for its own services and make it available to external parties. Safety is a significant focus. The new model will be open source, but Meta may reserve the right to license it commercially and provide additional services for fine-tuning with proprietary data.

Source

OpenAI enters partnership to make ChatGPT smarter

The Associated Press (AP) and OpenAI have agreed to collaborate and share select news content and technology. OpenAI will license part of AP's text archive, while AP will leverage OpenAI's technology and product expertise. The collaboration aims to explore the potential use cases of generative AI in news products and services.

AP has been using AI technology for nearly a decade to automate tasks and improve journalism. Both organizations believe in the responsible creation and use of AI systems and will benefit from each other's expertise. AP continues to prioritize factual, nonpartisan journalism and the protection of intellectual property.

Source

Stable Doodle: Next chapter in AI art

Stability AI, the startup behind Stable Diffusion, has released 'Stable Doodle,' an AI tool that can turn sketches into images. The tool accepts a sketch and a descriptive prompt to guide the image generation process, with the output quality depending on the detail of the initial drawing and the prompt. It utilizes the latest Stable Diffusion model and the T2I-Adapter for conditional control.

Stable Doodle is designed for both professional artists and novices and offers more precise control over image generation. Stability AI aims to quadruple its $1 billion valuation in the next few months.

Source

That's all for now!

If you are new to ‘The AI Edge’ newsletter, subscribe to receive the ‘Ultimate AI tools and ChatGPT Prompt guide’ specifically designed for Engineering Leaders and AI enthusiasts.

Thanks for reading, and see you on Monday! 😊

The AI Edge
