June 28, 2026

AI for Science with Qichao Hu (Molecular Universe / SES AI)

Show Notes

Most AI-for-science companies are selling shovels. Qichao Hu wants the gold.

In this episode, we talk with Qichao, the founder and CEO of Molecular Universe, the AI-for-science platform that grew out of SES AI, a high-energy-density battery developer he's run for fourteen years. His core distinction is that companies from the AI world build tools, such as foundation models that predict properties, while companies from the science world care about the final product, such as the new battery or material that actually ships. Molecular Universe sits firmly on the science side, and the difference shows up everywhere from what they publish to what they refuse to.

We get into the actual workflow of materials discovery and where AI compresses it. A single trial in a traditional lab can take a year with maybe a 40% success rate; the goal is to run a thousand candidates in parallel and turn that year into a week. Qichao walks through improving low-temperature fast-charging for EV batteries: from hypothesis generation through molecule-, material-, and device-level property prediction, down to autonomous labs that synthesize and test the top candidates without a human touching a pipette.

The hardest problem, it turns out, isn't predicting molecular properties or measuring device performance; it's the black box connecting the two. In batteries, that's the solid-electrolyte interphase, which the field has been hand-waving about since the seventies. And what stands in the way of cracking it isn't the lack of a clever training trick but the lack of usable data: companies sitting on twenty years of records are finding them too messy, incomplete, and poorly labeled to train on, and are having to start collecting from scratch with new protocols and robots.

Timeline

00:13 — Intro and welcome
01:19 — Shovel vs. gold
05:18 — Why the world's smartest scientist doesn't automatically give you a better battery
07:25 — The discovery workflow
09:37 — Exploration vs. exploitation
11:54 — Safety and filtering: screening novel molecules against banned and toxic-substance lists
17:55 — How hypotheses get generated, and where frontier LLMs help
20:29 — From hypothesis to ~400 formulations: property prediction, ranking, and handing off to autonomous labs
26:37 — "A foundation model for everything" — and the black box between molecular properties and device performance
30:01 — World models and physics
33:09 — The great unknown in batteries
37:08 — Simulation vs. reality: calibrating massive simulated datasets with a sliver of experimental data
41:47 — Lab robotics: how fast the hardware has caught up, and what a floor of autonomous labs looks like
43:50 — The real bottlenecks
50:21 — Pre-training from scratch vs. post-training LLMs, and why training tricks haven't reduced the need for good data
52:42 — Evaluation
55:42 — Publish the B+ model, keep the A model
58:05 — Five years out
1:00:37 — Closing thoughts and wrap

Music:

"Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.

About: The Information Bottleneck is hosted by Ravid Shwartz-Ziv and Allen Roush, featuring in-depth conversations with leading AI researchers about the ideas shaping the future of machine learning.