Apple’s MGIE: Making the Sky Bluer With Each Prompt!
Plus: Meta will label AI-generated images on its platforms, and Smaug-72B: The king of open-source AI is here!
Hello Engineering Leaders and AI Enthusiasts!
Welcome to the 205th edition of The AI Edge newsletter. This edition brings you Apple’s MGIE and how it is changing instruction-based image editing.
And a huge shoutout to our incredible readers. We appreciate you😊
In today’s edition:
🖌️ Apple’s MGIE: Making the sky bluer with each prompt
🏷️ Meta will label your content if you post an AI-generated image
👑 Smaug-72B: The king of open-source AI is here!
💳 Knowledge Nugget: LLMs are the variable interest credit card of tech debt
Let’s go!
Apple’s MGIE: Making the sky bluer with each prompt!
Apple released a new open-source AI model called MGIE (MLLM-Guided Image Editing) that edits images based on natural language instructions. MGIE leverages multimodal large language models (MLLMs) to interpret user commands and perform pixel-level image manipulation. It can handle editing tasks like Photoshop-style modifications, optimizations, and local editing.
MGIE integrates MLLMs into image editing in two ways. First, it uses the MLLM to understand the user input and derive an expressive instruction. For example, if the user asks to “make the sky more blue,” the model expands this into “increase the saturation of the sky region by 20%.” Second, that expressive instruction guides the editing model that generates the output image.
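To make that two-stage flow concrete, here is a short, purely illustrative Python sketch. The helper functions are hypothetical stand-ins for the MLLM and editing stages, not Apple’s actual MGIE code:

```python
# Illustrative sketch of MGIE's two-stage flow; the helpers below are
# hypothetical stand-ins, not the real MGIE implementation.
from PIL import Image


def derive_expressive_instruction(image: Image.Image, prompt: str) -> str:
    # Stage 1 stand-in: the MLLM looks at the image and expands a terse
    # user prompt into an explicit, expressive edit instruction.
    return f"Increase the saturation of the sky region (user asked: '{prompt}')."


def apply_edit(image: Image.Image, instruction: str) -> Image.Image:
    # Stage 2 stand-in: an editing model applies the derived instruction
    # at the pixel level; here we simply return the image unchanged.
    return image


source = Image.new("RGB", (512, 512), "skyblue")  # placeholder input image
instruction = derive_expressive_instruction(source, "make the sky more blue")
edited = apply_edit(source, instruction)
print(instruction)
```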
Why does this matter?
MGIE from Apple is a breakthrough in the field of instruction-based image editing. It is an AI model focusing on natural language instructions for image manipulation, boosting creativity and accuracy. MGIE is also a testament to the AI prowess that Apple is developing, and it will be interesting to see how it leverages such innovations for upcoming products.
Meta will label your content if you post an AI-generated image
Meta is developing advanced tools to label AI-generated images posted on its platforms, including Instagram, Facebook, and Threads. The labels will be aligned with the “AI-generated” signals defined in the C2PA and IPTC technical standards, allowing Meta to detect AI-generated images created with tools from companies like Google, OpenAI, Microsoft, Adobe, Midjourney, and Shutterstock.
Meta wants to differentiate between human-generated and AI-generated content on its platforms to reduce misinformation. However, the tool is limited to still images, so AI-generated video content still goes undetected on Meta’s platforms.
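If you are curious what such labeling looks like in practice, here is a minimal sketch that checks an image’s XMP metadata for the IPTC “digital source type” marker that C2PA/IPTC-aware tools write for AI-generated media. The exact tag layout varies by tool and standard version, so treat the key and value names below as assumptions to verify against the specs:

```python
# Minimal sketch: look for the IPTC "digital source type" marker that
# signals AI-generated media in an image's XMP metadata. Tag names and
# values are assumptions; verify against the C2PA/IPTC specifications.
import sys

from PIL import Image  # Pillow; getxmp() also needs defusedxml installed

AI_MARKER = "trainedAlgorithmicMedia"  # assumed IPTC term for generative AI


def contains_ai_marker(node) -> bool:
    """Recursively search the parsed XMP structure for the AI marker."""
    if isinstance(node, dict):
        return any(contains_ai_marker(v) for v in node.values())
    if isinstance(node, (list, tuple)):
        return any(contains_ai_marker(v) for v in node)
    return isinstance(node, str) and AI_MARKER in node


xmp = Image.open(sys.argv[1]).getxmp()  # empty dict if no XMP metadata
print("AI-generated marker found" if contains_ai_marker(xmp) else "No marker found")
```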
Why does this matter?
The level of misinformation and deepfakes generated by AI has been alarming. Meta is taking a step closer to reducing misinformation by labeling metadata and declaring which images are AI-generated. It also aligns with the European Union’s push for tech giants like Google and Meta to label AI-generated content.
Smaug-72B: The king of open-source AI is here!
Abacus AI recently released a new open-source language model called Smaug-72B. It outperforms GPT-3.5 and Mistral Medium on several benchmarks and is the first open-source model with an average score of over 80 across major LLM evaluations, according to the latest rankings from Hugging Face, one of the leading platforms for NLP research and applications.
Smaug-72B is a fine-tuned version of Qwen-72B, a powerful language model developed by researchers at Alibaba Group. It is aimed at helping enterprises solve complex problems by leveraging AI capabilities and enhancing automation.
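If you want to try it yourself, the weights can be pulled from the Hugging Face Hub with the transformers library. The model ID below is an assumption based on Abacus AI’s Hub organization, and a 72B-parameter model needs several high-memory GPUs (or aggressive quantization), so treat this as a sketch rather than a turnkey recipe:

```python
# Sketch: load Smaug-72B from the Hugging Face Hub with transformers.
# The model ID is assumed; verify it on Abacus AI's Hub page. Running
# the full 72B model requires multiple high-memory GPUs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacusai/Smaug-72B-v0.1"  # assumed Hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to cut memory use
    device_map="auto",           # shard layers across available GPUs
)

inputs = tokenizer("Summarize the benefits of open-source LLMs:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```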
Why does this matter?
Smaug-72B is the first open-source model to achieve an average score above 80 on the Hugging Face Open LLM Leaderboard. It is a breakthrough for enterprises, startups, and small businesses, breaking big tech companies’ monopoly over AI innovation.
Enjoying the daily updates?
Refer your pals to subscribe to our daily newsletter and get exclusive access to 400+ game-changing AI tools.
When you use the referral link above or the “Share” button on any post, you'll get credit for any new subscribers. All you need to do is send the link via text or email or share it on social media with friends.
Knowledge Nugget: LLMs are the variable interest credit card of tech debt
In this article, the authors discuss how Silicon Valley companies treat LLMs as the variable-interest credit card of tech debt and what that means for software development. Here are several reasons why LLMs are considered the new-age credit cards of tech debt:
LLMs deliver a quick start and quick results, which makes them attractive, but they often lead to prolonged development down the line.
Experimentation with prompts and hyperparameters demands significant time commitments.
Applications of LLMs go beyond a single API call, increasing experimentation costs and slowing down the development process.
Keeping up with new research, models, and best practices becomes challenging for any enterprise.
Integrating third-party services is recommended over building AI features from scratch, as the latter leads to a prolonged development cycle.
The article highlights how LLMs can prove challenging if enterprises try to build AI features from scratch. Further, both authors recommend that enterprises prioritize third-party services for faster development and continuous improvement.
Why does this matter?
Whether to develop an in-house AI model or leverage an existing one has long been debated among CTOs, CEOs, and CFOs. This article explains how many enterprises treat LLMs as a bankable shortcut and set out to build AI models from scratch, only to find that doing so leads to slower, prolonged development and higher experimentation costs.
What Else Is Happening❗
🧱OpenAI introduces watermarks to DALL-E 3 for content credentials.
OpenAI has added watermarks to image metadata to enhance content authenticity. The watermarks distinguish AI-generated from human-made content and can be verified through websites like Content Credentials Verify. They are being added to images generated on the ChatGPT website and through the DALL-E 3 API, and will be visible to mobile users starting February 12th. However, the feature is limited to still images only. (Link)
🤳Microsoft introduces Face Check for secure identity verification.
Microsoft has unveiled “Face Check,” a new facial recognition feature, as part of its Entra Verified ID digital identity platform. Face Check provides an additional layer of security for identity verification by matching a user's real-time selfie with their government ID or employee credentials. It is powered by Azure AI services and aims to enhance security while respecting privacy and compliance through a partnership approach. Microsoft's partner BEMO has already implemented Face Check for employee verification. (Link)
⬆️ Stability AI has launched an upgraded version of its Stable Video Diffusion (SVD).
Stability AI has launched SVD 1.1, an upgraded version of its image-to-video latent diffusion model, Stable Video Diffusion (SVD). This new model generates 4-second, 25-frame videos at 1024x576 resolution with improved motion and consistency compared to the original SVD. It is available via Hugging Face and Stability AI subscriptions. (Link)
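For those who want to experiment, a minimal image-to-video pass with the diffusers library might look like the sketch below. The model ID for the 1.1 checkpoint is an assumption to verify on Stability AI’s Hugging Face page, and downloading it may require accepting Stability’s license terms:

```python
# Sketch: image-to-video generation with Stable Video Diffusion via
# diffusers. The model ID is assumed; check Stability AI's Hugging Face
# page for the exact 1.1 checkpoint name and license requirements.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import export_to_video, load_image

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt-1-1",  # assumed ID
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

image = load_image("input.jpg").resize((1024, 576))  # SVD's native resolution
frames = pipe(image, decode_chunk_size=8).frames[0]  # 25 frames by default
export_to_video(frames, "output.mp4", fps=7)         # a short video clip
```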
🔍Stanford researchers introduce CheXagent, a new AI model for automated chest X-ray interpretation.
CheXagent, developed by Stanford University in partnership with Stability AI, is a foundation model for chest X-ray interpretation. It automates the analysis and summarization of chest X-ray images for clinical decision-making. CheXagent combines a clinical language model, a vision encoder, and a network that bridges vision and language. An accompanying benchmark, CheXbench, is available to evaluate the performance of foundation models on chest X-ray interpretation tasks. (Link)
🤝LinkedIn launched an AI feature to introduce users to new connections.
LinkedIn launched a new AI feature that helps users start conversations. Premium subscribers can use this feature when sending messages to others. The AI uses information from the subscriber’s and the other person's profiles to suggest what to say, like an introduction or asking about their work experience. This feature was initially available for recruiters and has now been expanded to help users find jobs and summarize posts in their feeds. (Link)
New to the newsletter?
The AI Edge keeps engineering leaders & AI enthusiasts like you on the cutting edge of AI. From ML to ChatGPT to generative AI and LLMs, we break down the latest AI developments and how you can apply them in your work.
Thanks for reading, and see you tomorrow. 😊