[#27] A new kind of neural net based on Fourier Transforms

Plus Meta's new research, new techniques to mitigate hallucinations, and generating images from brainwaves

Hello readers, in this issue we cover

  • Meta’s Movie Gen generates movies from text

  • Meta researchers use RAG to mitigate hallucinations from continually-pretrained LLMs

  • Fourier Analysis Networks, a new kind of neural net

  • A novel, training-free technique to mitigate hallucinations

  • Generating images from brainwaves

🎥 Meta’s Movie Gen generates movies from text

Movie created with Movie Gen

Meta announced Movie Gen, a group of foundation models that generate high-quality 1080p HD videos in various aspect ratios with synchronized audio. The models also support precise instruction-based video editing and personalized videos built from a user's image. Meta reports that they set a new state of the art in text-to-video generation, video personalization, video editing, and video-to-audio and text-to-audio generation. The largest model is a 30-billion-parameter transformer that can generate 16-second videos at 16 frames per second. The work includes innovations in model architecture, data handling, and optimization techniques, and aims to advance media generation research.

∿ Fourier Analysis Networks, a new kind of neural net

FANs vs MLP

While architectures such as MLPs and Transformers have seen huge success, they have a noticeable flaw when it comes to periodic patterns: instead of learning the underlying principle, they tend to memorize the patterns seen in training. This is a problem because periodicity (patterns that recur at regular intervals) is key to making predictions in both natural and engineered systems. To address this, researchers created FAN, a network architecture built on Fourier analysis that incorporates periodicity directly into its structure. The authors report that FAN models these repeating patterns more accurately than MLPs while using fewer resources. They show FAN's effectiveness across several tasks, including time series forecasting, language modeling, and symbolic formula representation, which suggests it is a strong alternative to traditional networks.
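To make the idea concrete, here is a minimal sketch of a FAN-style layer in plain Python. It assumes the layer concatenates cos(W_p x), sin(W_p x), and an ordinary activated linear branch; that formula is our reading of the paper rather than something stated in this summary, and the names `fan_layer`, `w_p`, `w_rest`, and `b_rest` are hypothetical.

```python
import math
import random


def fan_layer(x, w_p, w_rest, b_rest):
    """One FAN-style layer (assumed form): [cos(W_p x), sin(W_p x), gelu(W_rest x + b_rest)]."""

    def matvec(w, v):
        # Plain matrix-vector product over nested lists.
        return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

    def gelu(z):
        # Exact GELU using the error function.
        return 0.5 * z * (1.0 + math.erf(z / math.sqrt(2.0)))

    p = matvec(w_p, x)  # shared projection feeding both cos and sin
    rest = [z + b for z, b in zip(matvec(w_rest, x), b_rest)]
    periodic = [math.cos(v) for v in p] + [math.sin(v) for v in p]
    return periodic + [gelu(v) for v in rest]


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_in, d_p, d_rest = 4, 2, 3  # illustrative sizes, not taken from the paper
w_p = random_matrix(d_p, d_in, rng)
w_rest = random_matrix(d_rest, d_in, rng)
b_rest = [0.0] * d_rest

x = [0.1, -0.2, 0.3, 0.4]
out = fan_layer(x, w_p, w_rest, b_rest)
print(len(out))  # 2 * d_p periodic features plus d_rest standard features
```

Because the cosine and sine branches share one projection, the periodic part of the output costs a single matrix multiply, which is one plausible reason such a layer could need fewer parameters than an MLP layer of the same output width.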

👨🏻‍🔬 A novel, training-free technique to mitigate hallucinations

Multimodal Large Language Models (MLLMs) can sometimes "hallucinate," meaning they confidently make up information that is not present in the images they are analyzing. To address this, researchers introduced a method called Memory-space Visual Retracing (MEMVR). It mimics how people look at something a second time to recheck details they have forgotten. MEMVR reinjects visual prompts into the model's memory when the model is uncertain, which helps reduce mistakes without extra training or external knowledge. Tests show that MEMVR cuts down hallucinations without adding inference time, making it useful for many applications.

🧠 Meta researchers use RAG to mitigate hallucinations from continually-pretrained LLMs

This paper presents methods that could make privacy processes more efficient by combining LLMs with retrieval-augmented generation (RAG). To reduce hallucination, the researchers continually pre-train the base LLM on a privacy-specific knowledge base and then augment it with a semantic RAG layer. Evaluations show that this approach improves the model's handling of privacy-related queries, with some metrics as much as doubling compared to the out-of-the-box LLM, because grounding responses in factual information reduces inaccuracies.
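The retrieval half of this setup can be sketched in a few lines of Python. This is a toy illustration, not the paper's system: the knowledge-base entries are made up, a bag-of-words vector stands in for a real semantic embedding model, and `embed`, `retrieve`, and `build_prompt` are hypothetical names.

```python
import math
from collections import Counter

# Hypothetical privacy knowledge base; entries are invented for illustration.
KNOWLEDGE_BASE = [
    "Data retention policy: user logs are deleted after 90 days.",
    "Consent policy: marketing emails require explicit opt-in consent.",
    "Access policy: only the privacy team may export user data.",
]


def embed(text):
    """Bag-of-words stand-in for a real semantic embedding model."""
    for ch in ":.?,":
        text = text.replace(ch, " ")
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query, docs):
    """Ground the question in retrieved text before it reaches the LLM."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


question = "How long are user logs kept before they are deleted?"
prompt = build_prompt(question, KNOWLEDGE_BASE)
print(prompt)
```

The resulting prompt would then go to the continually pre-trained model, so the answer can draw on both what the model learned and the retrieved passage.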

🧠 Generating images from brainwaves

Generating images from brainwaves is gaining interest as a way to improve brain-computer interface (BCI) systems by understanding how brain signals represent visual information. Most research so far has used fMRI, a detailed but expensive imaging method that is not suitable for real-time use. EEG is cheaper, portable, and non-invasive, which makes it a better fit for real-time applications, but its low signal quality and noise make image generation harder. This paper introduces a method that generates images from EEG signals more simply and efficiently, with fewer steps than other methods. The authors report that their approach outperforms current models, and they say the code will be available after publication.

🤯 Today I Learned

Every issue, we highlight new AI concepts and terminology to help educate our readers. This issue we learned about:

Fourier Analysis Networks

A Fourier Analysis Network is a neural network that integrates Fourier analysis, which breaks down signals into their frequency components. The Fourier transform is used to switch between time (or spatial) and frequency domains, revealing patterns in the data. In neural networks, this allows for better handling of periodic or oscillatory patterns by using Fourier features or embeddings. Some networks include Fourier layers that process data before feeding it into standard layers, improving their ability to manage high-frequency information. These networks are useful in areas like signal processing, image analysis, and physics simulations, particularly when dealing with repeating patterns or wave-like behavior.
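As a small worked example of "breaking a signal into frequency components," the sketch below estimates Fourier coefficients from one period of samples and then evaluates the reconstruction far outside the sampled range. The test signal and the helper names `fourier_coefficients` and `reconstruct` are our own choices for illustration.

```python
import math


def fourier_coefficients(samples, max_harmonic):
    """Estimate cosine (a) and sine (b) coefficients from uniform samples of one period."""
    n = len(samples)
    xs = [2.0 * math.pi * i / n for i in range(n)]
    a = [sum(samples) / n]  # a[0] is the mean of the signal
    b = [0.0]
    for k in range(1, max_harmonic + 1):
        a.append(2.0 / n * sum(y * math.cos(k * x) for x, y in zip(xs, samples)))
        b.append(2.0 / n * sum(y * math.sin(k * x) for x, y in zip(xs, samples)))
    return a, b


def reconstruct(a, b, x):
    """Evaluate the truncated Fourier series at any x, inside or outside the sampled period."""
    return a[0] + sum(
        a[k] * math.cos(k * x) + b[k] * math.sin(k * x) for k in range(1, len(a))
    )


def signal(x):
    # Periodic target: one unit of sin(x) plus half a unit of cos(2x).
    return math.sin(x) + 0.5 * math.cos(2.0 * x)


n = 64
samples = [signal(2.0 * math.pi * i / n) for i in range(n)]
a, b = fourier_coefficients(samples, max_harmonic=3)

far_x = 100.0  # far beyond the sampled interval [0, 2*pi)
error = abs(reconstruct(a, b, far_x) - signal(far_x))
print(error)
```

The recovered coefficients put essentially all the weight on sin(x) and cos(2x), and the reconstruction stays accurate far from the sampled interval. That is the behavior a model needs if it is to capture periodicity rather than memorize the training range.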