Issue #3: Microsoft creates a text-to-speech model that sounds like a human
Plus AI protein design, secure code generation, mixture-of-experts, and multi-agent systems
Lots of exciting news to share. In today’s issue, we cover:
- Microsoft’s text-to-speech model, VALL-E 2, reaches human parity (audio samples below)
- Nvidia researchers introduce a new reinforcement learning technique called Gradient Boosting Reinforcement Learning
- Ex-Meta scientists debut a gigantic AI protein design model that’s already designed new fluorescent proteins
- Prompting techniques for secure code generation
- A survey of Mixture-of-Experts models
- Ant Group (an affiliate of Alibaba) creates the PEER framework to give multi-agent systems expertise
🗣️ Microsoft reaches human parity in text-to-speech models
Microsoft announced VALL-E 2, a text-to-speech model that achieves human parity for the first time. From only a 3-second recording, VALL-E 2 consistently synthesizes high-quality speech, even for sentences that are traditionally challenging due to their complexity or repetitive phrases. This work could enable valuable applications, such as generating speech for people with aphasia or amyotrophic lateral sclerosis. Listen to the audio samples
⛰️ Nvidia researchers introduce Gradient Boosting Reinforcement Learning (GBRL)
Neural networks (NNs) excel at many tasks but are often hard to interpret, handle categorical data poorly, and are costly to run on small devices. Gradient Boosting Trees (GBTs), by contrast, are strong on all three counts and are widely used in practice, yet they have traditionally seen little use in reinforcement learning (RL). This paper introduces Gradient-Boosting RL (GBRL), which adapts GBTs for RL: it uses efficient tree-sharing methods, performs well across a range of tasks, and integrates easily with existing RL tools. This offers a new, effective option for RL, especially with structured data.
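GBRL’s actual implementation builds tree-sharing into actor-critic algorithms, but the core idea of using a GBT as the RL function approximator can be sketched minimally. Below is a toy fitted Q-iteration loop using scikit-learn’s GradientBoostingRegressor; the environment, hyperparameters, and random data collection are illustrative assumptions, not the paper’s setup.

```python
# Illustrative sketch only: GBRL itself uses custom tree-sharing inside
# actor-critic algorithms. This is bare-bones fitted Q-iteration with a
# gradient-boosted tree as the Q-function, to show the general idea.
import numpy as np
import gymnasium as gym
from sklearn.ensemble import GradientBoostingRegressor

env = gym.make("CartPole-v1")
gamma, n_actions = 0.99, env.action_space.n

# Collect random transitions (s, a, r, s', done).
obs, _ = env.reset()
data = []
for _ in range(5000):
    a = env.action_space.sample()
    nxt, r, term, trunc, _ = env.step(a)
    data.append((obs, a, r, nxt, float(term)))
    obs = nxt
    if term or trunc:
        obs, _ = env.reset()

S = np.array([np.append(s, a) for s, a, _, _, _ in data])  # state-action inputs
R = np.array([r for _, _, r, _, _ in data])
S2 = np.array([s2 for _, _, _, s2, _ in data])
done = np.array([d for _, _, _, _, d in data])

q = None
for _ in range(10):                        # fitted Q-iteration sweeps
    if q is None:
        target = R                         # first sweep: immediate rewards
    else:
        # max over actions of the previous ensemble's Q(s', a')
        q_next = np.column_stack([
            q.predict(np.column_stack([S2, np.full(len(S2), a)]))
            for a in range(n_actions)])
        target = R + gamma * (1 - done) * q_next.max(axis=1)
    q = GradientBoostingRegressor(n_estimators=100, max_depth=3).fit(S, target)
```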
🧬 Ex-Meta scientists debut gigantic AI protein design model
Ex-Meta scientists have developed ESM3, one of the largest AI models for biology, which can design new proteins. The model has already created new fluorescent proteins, and the team has raised $142 million to further its applications in drug development and sustainability. ESM3 was trained on billions of protein sequences and can generate proteins with specified properties. The team aims to make biology more programmable, a significant advance for synthetic biology. Read the article in Nature.
A structural model of green fluorescent protein, a workhorse of biotechnology
🔐 Prompting Techniques for Secure Code Generation
What are the existing prompting techniques that can be used for code generation? What is the impact of different prompting techniques on the security of LLM-generated code?
This paper investigates both questions and finds that a prompting technique called Recursive Criticism and Improvement (RCI) significantly reduces security weaknesses in LLM-generated code across all tested models. arxiv
Taxonomy of prompting techniques for code generation
👩‍🍳 A Survey on Mixture-of-Experts (MoE)
This is a good starting point to understand the landscape of Mixture of Experts models.
This paper provides a systematic and comprehensive review of the literature on Mixture-of-Experts (MoE) models, a technique in which multiple expert networks (learners) divide a problem space into homogeneous regions. It reviews the designs of various MoE models, including their algorithmic and system-level aspects, open-source implementations, and potential directions for future research. arxiv
Chronological order of MoE models
📕 Researchers give multi-agent systems expertise
Researchers at Ant Group simulate the collaborative processes of human experts using multiple agents, achieving comparable interpretative results. This approach is encapsulated in the Plan, Execute, Express and Review (PEER) framework, where domain-specific (e.g. financial) tasks are divided into these four steps. Each agent specializes in a single step, and the agents work together to accomplish the overall objective; a sketch of the pattern follows the figure below. arxiv
Example execution of the PEER framework
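As a rough sketch of the pattern, not Ant Group’s implementation, the four roles could be chained like this; `ask_llm` and the prompts are hypothetical placeholders:

```python
# Hypothetical sketch of a PEER-style pipeline: Plan -> Execute -> Express -> Review.
# `ask_llm` is a placeholder for your model call; prompts are illustrative only.
def ask_llm(system: str, user: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def peer(question: str, max_rounds: int = 2) -> str:
    for _ in range(max_rounds):
        # Plan: break the question into sub-questions.
        plan = ask_llm("You are a planner. List the sub-questions to answer.", question)
        # Execute: gather an answer for each sub-question.
        evidence = ask_llm("You are a researcher. Answer each sub-question.", plan)
        # Express: synthesize the evidence into a final answer.
        answer = ask_llm("You are a writer. Synthesize a final answer.",
                         f"Question: {question}\nEvidence: {evidence}")
        # Review: critique the answer; stop if it passes.
        verdict = ask_llm("You are a reviewer. Reply PASS or list the flaws.", answer)
        if verdict.strip().startswith("PASS"):
            return answer
        question = f"{question}\nReviewer feedback to address: {verdict}"
    return answer
```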
🤯 Today I Learned
Every issue, we highlight new AI concepts and terminology to help educate our readers. In this issue, we learned about:
Mixture-of-Experts
A Mixture of Experts (MoE) model is an advanced machine learning architecture that combines multiple specialized neural networks, called "experts," with a gating mechanism to solve complex tasks. More concisely, MoE models:
- Structure: MoE models consist of multiple "expert" neural networks and a gating network.
- Experts: Each expert specializes in handling specific types of inputs or subtasks.
- Gating network: This component decides which expert(s) to use for a given input.
- Functionality: Given an input, the gating network activates the most appropriate expert(s) to process it.
- Efficiency: MoE models can handle diverse tasks more efficiently than single large networks by distributing computation across experts.
This architecture allows for more flexible and efficient processing of complex tasks by leveraging specialized components.
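A minimal sketch of that structure in PyTorch, with top-1 routing and toy dimensions as illustrative assumptions (not from the survey):

```python
# Minimal top-1 Mixture-of-Experts layer: a gating network routes each
# input to one expert MLP. Sizes and routing scheme are toy assumptions.
import torch
import torch.nn as nn

class MoE(nn.Module):
    def __init__(self, d_model=64, d_hidden=128, n_experts=4):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)   # gating network
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                            # x: (batch, d_model)
        probs = self.gate(x).softmax(dim=-1)         # expert probabilities
        top_p, top_idx = probs.max(dim=-1)           # top-1 routing decision
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i                      # inputs routed to expert i
            if mask.any():
                out[mask] = top_p[mask].unsqueeze(-1) * expert(x[mask])
        return out

moe = MoE()
y = moe(torch.randn(8, 64))  # each row is processed by one selected expert
```

Only the selected expert runs for each input, which is what lets MoE models grow total parameter count without a proportional increase in per-input compute.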
Recursive Criticism and Improvement
Recursive Criticism and Improvement (RCI) is a method in which a large language model (LLM) generates an initial solution to a task, critiques its own output to identify errors, and then generates an improved solution based on that critique. Repeating this process iteratively lets the model refine its output, significantly improving its performance on computer-control and reasoning tasks.
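A minimal sketch of the loop, where `ask_llm` is a hypothetical placeholder for your model call and the prompts are illustrative rather than the paper’s exact prompts:

```python
# Hypothetical sketch of Recursive Criticism and Improvement (RCI).
# `ask_llm` is a placeholder for your model call; prompts are illustrative.
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def rci(task: str, rounds: int = 3) -> str:
    # Initial attempt.
    solution = ask_llm(f"Solve the following task:\n{task}")
    for _ in range(rounds):
        # Criticism: the model reviews its own output.
        critique = ask_llm(
            f"Task: {task}\nSolution: {solution}\n"
            "Review this solution and list any errors or weaknesses.")
        # Improvement: the model revises based on its critique.
        solution = ask_llm(
            f"Task: {task}\nSolution: {solution}\nCritique: {critique}\n"
            "Write an improved solution that addresses the critique.")
    return solution
```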
Gradient Boosting
Gradient boosting is a machine learning technique used to improve the accuracy of models by combining the predictions of multiple weak learners, usually decision trees, into a single strong learner. It works by sequentially adding trees, each one correcting the errors of the previous ones. This iterative process focuses on the data points that the previous models misclassified, gradually improving the model's performance. Gradient boosting is widely used for tasks like regression and classification due to its high predictive accuracy.
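To make the residual-fitting loop concrete, here is a minimal sketch using shallow scikit-learn trees on toy data (the data and hyperparameters are illustrative assumptions):

```python
# Minimal gradient boosting for regression with squared loss: each new
# shallow tree is fit to the residuals of the current ensemble.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)   # toy target

lr, trees = 0.1, []
pred = np.full_like(y, y.mean())    # start from the mean prediction
for _ in range(100):
    residuals = y - pred            # negative gradient of squared loss
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residuals)          # weak learner corrects current errors
    pred += lr * tree.predict(X)    # shrink each tree's contribution
    trees.append(tree)

def ensemble_predict(X_new):
    """Sum the shrunken trees on top of the base prediction."""
    return y.mean() + lr * sum(t.predict(X_new) for t in trees)
```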