Issue #11: How Meta recommends short-form videos

Plus Snowflake's Arctic-TILT LLM architecture, generating better citations than ChatGPT, stopping adversarial AI attacks at the edge, and better aligning LLMs

Hello readers! In today’s issue we cover:

  1. Meta creates a new recommendation system used in short-form videos

  2. How Snowflake built Arctic-TILT, its document understanding LLM

  3. Researchers in China developed a new framework that generates citations better than ChatGPT

  4. EdgeShield, a new framework that can stop adversarial AI attacks on edge devices

  5. Meta researchers release a new method, called instruction back-and-forth translation, to better align LLMs

👍 Meta’s New Recommendation System Used in Short-Form Videos

Models recommend more popular items, despite users interacting frequently with less popular ones

Researchers at Meta created a new method for recommendation systems. Typically, a recommendation system faces three main challenges:

  1. Light users: Users who don't interact much with the platform are hard to make recommendations for because of data sparsity.

  2. Heavy users: Highly active users have complex, varied interests that are difficult to capture accurately.

  3. Technical constraints: The system needs to work quickly and efficiently for millions of users.

To solve these issues, the researchers added an "interest" layer between users and items (like videos or posts). This helps in several ways (a rough sketch of the idea follows the list):

  1. It connects light users to heavy users with similar interests, improving recommendations for less active users.

  2. It better captures the complex interests of heavy users.

  3. It makes the system more efficient and able to handle large amounts of data quickly.
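To make the idea concrete, here is a rough sketch of recommending through an explicit interest layer instead of matching users directly to items. Everything here (the toy data, the item-to-interest mapping, the scoring) is an illustrative assumption, not Meta's implementation; in practice the interest layer would be learned from interaction data.

```python
from collections import Counter, defaultdict

# Toy interaction data: user -> list of item ids they engaged with.
interactions = {
    "heavy_user_1": ["v1", "v2", "v3", "v4", "v5", "v6"],
    "heavy_user_2": ["v2", "v3", "v7", "v8"],
    "light_user_1": ["v2"],                # sparse history
}

# Hypothetical mapping from item to an "interest" (in a real system this
# would be learned, e.g. by clustering item embeddings).
item_to_interest = {
    "v1": "cooking", "v2": "cooking", "v3": "travel", "v4": "travel",
    "v5": "music", "v6": "music", "v7": "travel", "v8": "music",
}

def user_interests(user):
    """Aggregate a user's history into interest-level weights."""
    counts = Counter(item_to_interest[i] for i in interactions[user])
    total = sum(counts.values())
    return {interest: c / total for interest, c in counts.items()}

# Invert the mapping so each interest points at candidate items.
interest_to_items = defaultdict(list)
for item, interest in item_to_interest.items():
    interest_to_items[interest].append(item)

def recommend(user, k=3):
    """Score candidates through the interest layer, not item-to-item."""
    weights = user_interests(user)
    seen = set(interactions[user])
    scores = {}
    for interest, weight in weights.items():
        for item in interest_to_items[interest]:
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + weight
    return sorted(scores, key=scores.get, reverse=True)[:k]

# A light user with one interaction still gets interest-level candidates,
# because "cooking" is shared with heavier users.
print(recommend("light_user_1"))
```

The point of the sketch: a light user with a single interaction still inherits candidates through the shared interest, which is how the extra layer bridges light and heavy users.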

The researchers tested their method on public datasets and found it performed better than existing systems while being more computationally efficient. They've also implemented this approach in several products at Meta (Facebook's parent company), particularly for recommending short-form videos.

❄️ Snowflake Releases Their Arctic-TILT Research for Document Understanding

Arctic-TILT encoder architecture

Three months ago, Snowflake wrote a blog post about their Arctic-TILT model, a compact model built into their Document AI feature that extracts data from documents and lets users ask questions about them.

Now, they’ve released the paper detailing the architecture, an encoder-decoder model that fuses text and vision.
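As a rough sketch of that general pattern, the PyTorch snippet below shows an encoder-decoder over document tokens whose embeddings are fused with per-token visual features. The dimensions, the simple additive fusion, and all module names are assumptions made for illustration; the paper describes the actual Arctic-TILT design.

```python
import torch
import torch.nn as nn

class TextVisionEncoderDecoder(nn.Module):
    """Toy encoder-decoder that fuses text embeddings with visual features.

    Illustrative only: real document-understanding models fuse OCR-token
    embeddings with image features (and layout information) before encoding.
    """
    def __init__(self, vocab_size=32000, d_model=256, img_feat_dim=512):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        # Project per-token visual features (e.g. pooled from the page image
        # around each token's bounding box) into the text embedding space.
        self.vision_proj = nn.Linear(img_feat_dim, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2,
            batch_first=True,
        )
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, text_ids, img_feats, decoder_ids):
        # Fuse modalities by adding projected image features to token embeddings.
        enc_in = self.token_emb(text_ids) + self.vision_proj(img_feats)
        dec_in = self.token_emb(decoder_ids)
        hidden = self.transformer(enc_in, dec_in)
        return self.lm_head(hidden)

model = TextVisionEncoderDecoder()
text_ids = torch.randint(0, 32000, (1, 128))      # OCR tokens from the document
img_feats = torch.randn(1, 128, 512)              # visual feature per token
decoder_ids = torch.randint(0, 32000, (1, 16))    # answer tokens so far
logits = model(text_ids, img_feats, decoder_ids)  # shape (1, 16, 32000)
```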

💬 Generating Better Citations than ChatGPT and Mitigating Hallucinations

The FRONT framework's two-stage training recipe

Researchers introduce a new training framework, called FRONT, that outperforms ChatGPT's citation quality. The framework teaches LLMs to generate fine-grained, grounded citations, producing responses backed by highly supportive evidence.

The research works towards mitigating hallucinations and allows users to confidently verify responses.
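To illustrate what fine-grained grounded citations buy you (a generic sketch, not FRONT's actual training recipe), imagine the model first extracting quotes from retrieved documents and then citing those quotes in its answer; a simple checker can then confirm every citation is really supported by its source.

```python
# Hypothetical illustration of fine-grained grounded citations: the model
# extracts supporting quotes from retrieved documents, then writes a response
# whose claims cite those quotes. A verifier checks that every cited quote
# literally occurs in its source document.

documents = {
    1: "The Eiffel Tower was completed in 1889 for the World's Fair.",
    2: "It stands 330 metres tall, making it the tallest structure in Paris.",
}

# Grounding-style output (illustrative): quotes tied to specific documents.
quotes = {
    "q1": {"doc": 1, "text": "completed in 1889"},
    "q2": {"doc": 2, "text": "330 metres tall"},
}

# Generation-style output (illustrative): response with per-claim citations.
response = "The Eiffel Tower was finished in 1889 [q1] and is 330 metres tall [q2]."

def citations_supported(quotes, documents):
    """Return True only if every cited quote appears in its source document."""
    return all(q["text"] in documents[q["doc"]] for q in quotes.values())

print(citations_supported(quotes, documents))  # True -> claims can be verified
```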

👾 Stopping Attacks on AI Systems at the Edge

Adversarial attacks on AI systems are increasingly common. The current methods of defending against these attacks are resource-intensive, requiring substantial computing power and back-end processing—challenges that make real-time defense nearly impossible.

Recent breakthroughs in edge computing enable deploying neural networks directly on edge devices, paving the way for more nimble and efficient solutions. Leveraging these advancements, researchers have developed EdgeShield, a framework designed to universally and efficiently detect adversarial attacks.

In tests, EdgeShield proved effective at spotting adversarial threats with reduced complexity and cost. This type of framework could be the key to keeping AI systems secure without the hefty price tag.
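The paper has the details of EdgeShield's detector. As a generic illustration of the kind of lightweight check that fits on an edge device (this is feature squeezing, a separate published technique, not EdgeShield's method), one can compare a model's prediction on an input against its prediction on a coarsened copy and flag large disagreements:

```python
import numpy as np

def squeeze_bit_depth(image, bits=4):
    """Reduce color bit depth: a cheap input transform (feature squeezing)."""
    levels = 2 ** bits - 1
    return np.round(image * levels) / levels

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def looks_adversarial(model, image, threshold=0.5):
    """Flag inputs whose predictions change a lot after squeezing.

    `model` is any callable mapping an image in [0, 1] to class logits.
    Adversarial perturbations often rely on fine-grained pixel noise, so
    their effect tends to shrink when the input is coarsened.
    """
    p_orig = softmax(model(image))
    p_squeezed = softmax(model(squeeze_bit_depth(image)))
    l1_gap = np.abs(p_orig - p_squeezed).sum()
    return l1_gap > threshold

# Usage with a stand-in "model" (a random linear scorer, illustration only):
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 32 * 32 * 3))
model = lambda img: W @ img.reshape(-1)
print(looks_adversarial(model, rng.random((32, 32, 3))))
```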

📏 Meta Researchers Release a New Method to Better Align LLMs

Meta and University of Washington researchers release a new method, called instruction back-and-forth translation, for constructing high-quality synthetic data. The data is curated, backtranslated, and then used to fine-tune LLMs, resulting in better alignment, which ensures an LLM's behavior is in line with human values, intentions, and goals.
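Here is a hedged sketch of one way the back-and-forth idea can be structured, assuming a hypothetical `llm(prompt)` text-generation call (not a real API): synthesize an instruction from a web document (the "backward" step), then rewrite the document into a clean response to that instruction (the "forward" step).

```python
# Illustrative pipeline shape only; the paper defines the actual procedure.

def llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with your own model or API client."""
    raise NotImplementedError

def back_and_forth_pair(web_document: str) -> dict:
    # Backward step: synthesize an instruction that the document could answer.
    instruction = llm(
        "Write an instruction or question for which the following text "
        f"would be a good answer:\n\n{web_document}"
    )
    # Forward step: rewrite the raw web text into a clean, direct response.
    response = llm(
        f"Instruction: {instruction}\n\nRewrite the following text so it "
        f"directly and clearly answers the instruction:\n\n{web_document}"
    )
    return {"instruction": instruction, "response": response}

# The resulting (instruction, response) pairs form the synthetic dataset used
# for alignment-oriented supervised fine-tuning.
```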

🤯 Today I Learned

Alignment

Alignment refers to the process of ensuring that an AI system's behavior and outputs are in line with human values, intentions, and goals.

Alignment aims to address several key aspects:

  1. Safety: Ensuring the AI system doesn't cause unintended harm or behave in dangerous ways.

  2. Ethics: Making sure the AI's actions and decisions align with human ethical principles.

  3. Usefulness: Optimizing the AI to provide genuinely helpful responses and perform tasks as intended by its human users.

  4. Reliability: Reducing the likelihood of the AI producing false, misleading, or inconsistent information.

  5. Controllability: Maintaining human control over the AI system's capabilities and actions.

Backtranslation

Backtranslation is a technique used to improve language models and translations. It involves translating a sentence from one language to another, then translating it back to the original language. The difference between the original and back-translated sentences helps identify errors or refine the translation model. This process also creates new training data, helping models learn to handle different phrasings. Overall, it enhances the accuracy and flexibility of language processing systems.
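A minimal sketch in Python, assuming a hypothetical `translate(text, src, tgt)` function (swap in any real machine-translation model or service):

```python
def translate(text: str, src: str, tgt: str) -> str:
    """Hypothetical translation call; plug in any MT model or service."""
    raise NotImplementedError

def backtranslate(sentence: str, pivot: str = "fr") -> str:
    """Round-trip a sentence: English -> pivot language -> English."""
    return translate(translate(sentence, "en", pivot), pivot, "en")

# Example (with a real `translate` plugged in):
#   original   = "The cat sat quietly on the warm windowsill."
#   paraphrase = backtranslate(original)
# Comparing the two surfaces translation errors, and the pair itself is new
# training data: same meaning, different phrasing.
```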

Encoder Block

An encoder block is a key part of some neural networks that helps in processing and understanding data. In models like Transformers, it helps the network focus on important parts of a sequence, like understanding the meaning of words in a sentence. For autoencoders, the encoder block simplifies complex data into a smaller, more useful form, like compressing a photo into a tiny file. It's like a tool that turns messy, raw data into something easier to work with. Overall, the encoder block helps the network learn and make better decisions.
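For readers who want to see the shape of it, here is a minimal PyTorch sketch of a Transformer-style encoder block; the dimensions are arbitrary and chosen only for illustration:

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    """Minimal Transformer-style encoder block: self-attention plus a
    feed-forward network, each with a residual connection and layer norm."""
    def __init__(self, d_model=64, n_heads=4, d_ff=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        # Self-attention lets every position look at every other position.
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + attn_out)
        # The feed-forward network then transforms each position independently.
        x = self.norm2(x + self.ff(x))
        return x

block = EncoderBlock()
tokens = torch.randn(1, 10, 64)   # 1 sequence, 10 positions, 64 dims each
print(block(tokens).shape)        # torch.Size([1, 10, 64])
```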