Why All Models Learn the Same Thing with Phillip Isola (MIT)

The Information Bottleneck

Ravid Shwartz-Ziv

Jul 02, 2026

Phillip Isola, professor at MIT, joins us to talk about representation learning: what makes a representation good, why different models seem to converge on similar representations, and whether pre-training is really over.

We discuss the platonic representation hypothesis and its limits, why clustering structure matters more than global geometry, and Phillip's new neural thickets paper arguing that post-training is easier than people think because pre-trained weights already sit near solutions to downstream tasks. Phillip also explains why he thinks LLMs are already world models, why he's betting on RNNs making a comeback, and why his most exciting current direction is artificial life: putting LLM agents in open environments with no fixed task and studying them like new organisms.

Timeline:

00:00 Intro song
00:13 Intro
01:05 What is representation learning and why it matters
04:09 What makes a representation good: minimality and sufficiency
10:03 How cross-entropy and contrastive learning shape representations
14:35 Dimensionality reduction and why dimension isn't the right complexity measure
16:35 Compression and geometric clustering during training
19:27 The platonic representation hypothesis and what actually converges
22:53 Local neighborhoods vs global structure: the Aristotelian follow-up
24:33 When convergence is strong: truth vs the space of possibility
28:09 Is there true similarity in the world? The Bouba-Kiki effect
30:56 World models vs autoregressive LLMs
32:14 Diffusion LLMs as a special case of autoregressive models
33:42 What architectures win in five years: the case for RNNs
36:11 Grad student descent, or do we actually have principles?
40:51 Feathers and wings: what to take from biology
43:17 How close are we to brain-like models? Marr's three levels
47:01 Are better models becoming less human-like?
49:38 Is pre-training all you need? The neural thickets paper
54:18 LoRA, low-rank fine-tuning, and why post-training is easier than we thought
56:01 RL environments and what our benchmarks actually test
1:01:11 Artificial life: LLM agents as new organisms
1:07:20 What's overlooked in AI research right now
1:08:36 Why stay in academia, and doing science in the age of Opus

Music:

"Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.

About: The Information Bottleneck is hosted by Ravid Shwartz-Ziv and Allen Roush, featuring in-depth conversations with leading AI researchers about the ideas shaping the future of machine learning.
