Meta’s TRIBE v2 can predict how your brain responds to videos and sounds

Meta built a model that predicts how your brain responds to what you see and hear. They also own the world’s largest advertising platform.


Meta AI released TRIBE v2, an updated brain encoder model that predicts how human brains respond to video, audio, and text. When you give it a piece of content, it tells you which brain regions activate and how strongly.

The original TRIBE won a major neuroscience competition last year, placing first out of 262 teams. v2 is significantly more capable.

  • Training data: 500 hours of fMRI recordings from over 700 participants.
  • Zero-shot generalization: Predicts brain responses for people it has never seen, across new languages and new tasks.
  • Performance: 2 to 3 times better than prior methods on unseen subjects.
  • Accuracy: Captures up to 54% of the explainable variance in brain activity in sensory areas.

The model processes three modalities at once. Video goes through Meta’s V-JEPA 2, audio through Wav2Vec2-BERT, and text through Llama 3.2. Together they produce predictions across roughly 1,000 brain regions.
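For readers who want a concrete picture of the architecture, the sketch below shows one way a tri-modal brain encoder can be wired together in PyTorch. It is a minimal illustration, not Meta's actual implementation: the class name, feature dimensions, and the sum-then-transformer fusion are assumptions, and the real model draws its per-modality features from the pretrained encoders named above.

```python
import torch
import torch.nn as nn

class TriModalBrainEncoder(nn.Module):
    """Hypothetical sketch of a TRIBE-style encoder: fuse features from
    three modalities and regress fMRI activity across ~1,000 brain
    regions. Dimensions and fusion strategy are illustrative only."""

    def __init__(self, video_dim=1024, audio_dim=1024, text_dim=2048,
                 hidden_dim=512, n_regions=1000):
        super().__init__()
        # Project each modality's pretrained features into a shared space.
        self.video_proj = nn.Linear(video_dim, hidden_dim)
        self.audio_proj = nn.Linear(audio_dim, hidden_dim)
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        # A small transformer fuses the three streams over time.
        layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=8,
                                           batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        # One regression output per brain region, at every timestep.
        self.head = nn.Linear(hidden_dim, n_regions)

    def forward(self, video_feats, audio_feats, text_feats):
        # Inputs: (batch, time, dim) features from frozen pretrained
        # encoders (e.g., V-JEPA 2, Wav2Vec2-BERT, Llama 3.2).
        x = (self.video_proj(video_feats)
             + self.audio_proj(audio_feats)
             + self.text_proj(text_feats))
        x = self.fusion(x)
        return self.head(x)  # (batch, time, n_regions) predicted signal

# Usage with random stand-in features, time-aligned across modalities:
model = TriModalBrainEncoder()
v = torch.randn(1, 100, 1024)   # 100 timesteps of video features
a = torch.randn(1, 100, 1024)   # audio features
t = torch.randn(1, 100, 2048)   # text features
pred = model(v, a, t)           # -> torch.Size([1, 100, 1000])
```

The design point this sketch tries to capture is that the heavy lifting happens in frozen, pretrained unimodal encoders; the trainable part only has to learn how their features map onto fMRI time series.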

Meta is releasing the model, codebase, and a live demo openly. The stated applications are neuroscience research, brain-inspired AI architectures, and neurological disease simulation without constant patient scanning.

Here is the part worth thinking about.

Meta owns Facebook, Instagram, and WhatsApp. Their business is built on understanding what content drives engagement. When you have a model that predicts neural responses to combinations of video, audio, and text, you also have a tool for designing content that triggers the strongest possible biological response in viewers.

  • What Meta says: The research team frames this entirely as neuroscience. No commercial application is suggested.
  • What the researchers acknowledge: The dual-use concern is real. They flag it themselves in the paper.
  • The context: A model trained to predict which content activates which brain regions, built by a company whose revenue depends on maximizing time spent on screens, deserves scrutiny regardless of intent.

Bottom line: A model that predicts brain responses to content, released by the company that sells ads against that content, does not need to be sinister to be worth watching.

Source: Meta

RunPod

If you need on-demand GPUs for training, fine-tuning, inference, or running open-source models, give RunPod a try.

  • Available hardware: H100, H200, A100, L40S, RTX 4090, RTX 5090, and 30+ more
  • Cost: Significantly cheaper than AWS or GCP, billed per second, no contracts
  • Setup: Spins up in under a minute, 30+ regions worldwide
Try RunPod →
Affiliate disclosure: We may earn a commission if you sign up via our link, at no extra cost to you.
Efficienist Newsletter

Get the core business tech news delivered straight to your inbox. We track AI, automation, SaaS, and cybersecurity so you don't have to.

Just read what you want, and be done with it.
