After Math Falls, What's Next? with Julia Kempe (NYU/Meta)


Julia Kempe on Why Math Will Fall Next, Superhuman Provers, and the Return of the Renaissance Researcher
In this episode, we sit down with Julia Kempe, a Professor at NYU's Center for Data Science and researcher at Meta FAIR's Foundations of Reasoning team, for a wide-ranging conversation on the future of AI research.
We dig into why verifiable domains like mathematics may be on track to "fall" the way Go did. With formal verification through Lean and the Mathlib infrastructure, LLM agents can now generate and check proofs at scale, and Julia makes the case that a new industry of automated mathematical discovery is closer than most mathematicians believe. We explore why Erdős problems are already falling, what's still missing for harder fields like analysis and physics, and how synthetic data, curation, and verification fit together.
From there we get into the energy and scaling limits of frontier models, the case for academic research that big labs can't pursue, how to advise PhD students when Claude can already do their first-year work, the rise of AI safety and security as research priorities, and Julia's optimistic argument that AI tools are bringing back the Renaissance generalist - the researcher who can finally work fluently across math, biology, and beyond.
Timeline
- 00:00 — Introductions
- 01:00 — Defining reasoning and verifiable domains
- 04:00 — Lean, Mathlib, and the formalization of mathematics
- 10:00 — Constructive proofs, Erdős problems, and the new wave of "AI mathematicians"
- 14:00 — Will math be "solved"? Art, photography, and the changing nature of creative work
- 18:00 — Why physics is harder than math
- 22:00 — Moravec's paradox, evolution, and why robotics lags behind language
- 27:00 — The Renaissance is back: generalist researchers in the age of AI
- 29:00 — Advising students: math, programming, and what core education still matters
- 32:00 — Teaching and assessment when GPT can do the homework
- 35:00 — Anti-AI backlash, energy costs, and the security threat
- 40:00 — Scaling vs. efficiency
- 42:00 — Model collapse, synthetic data, and what's left to squeeze from the internet
- 44:00 — What's exciting next: AI for science, safety, robotics, memory, and planning
- 47:00 — Annotation costs as a proxy
- 50:00 — Superhuman models and what security even means against them
- 52:00 — AlphaGo as precedent for verifiable superhuman performance
- 54:00 — Hallucination, the Mirage paper, and whether these are solvable problems
- 56:00 — Why coding isn't fully solved yet
- 58:00 — Agent security, prompt injection, and the Wild West of deployed agents
- 1:01:00 — Regulation: what's needed and what's possible
- 1:04:00 — Advice for PhD students and what research academia should pursue
- 1:09:00 — Startup opportunities: robotics, security, and AI for finance
- 1:12:00 — Closing thoughts: use the tools, and build grassroots AI for good
Music:
- "Kid Kodi" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.
- "Palms Down" - Blue Dot Sessions - via Free Music Archive - CC BY-NC 4.0.
- Changes: trimmed
About: The Information Bottleneck is hosted by Ravid Shwartz-Ziv and Allen Roush, featuring in-depth conversations with leading AI researchers about the ideas shaping the future of machine learning.
Ravid Shwartz-Ziv: Hey everyone, and welcome back to another episode of The Information Bottleneck. I'm Ravid, and with me is my co-host Allen. Hey Allen, how are you?
Allen Roush: Hi, Ravid. I'm great. And hi, Julia. How are you?
Julia Kempe: Hello, it's a real pleasure to be here.
Ravid Shwartz-Ziv: It's a pleasure that you came. So today we have a really, really great guest, Julia Kempe. She is a professor at the NYU Center for Data Science and also a researcher at Meta. So thanks again, Julia.
Julia Kempe: My pleasure.
Ravid Shwartz-Ziv: So today we are going to talk about a lot of different topics. You have done so many things related to so many different domains: AI, math, a lot of different research ideas. Let's start with reasoning. Let's first define reasoning, and then we'll get to why it's actually helpful.
Julia Kempe: Thank you for the question, Ravid. And it's true, you immediately put me on the spot, because we all do reasoning research without really wanting to nail it down and define it. In fact, the team I am in at Meta FAIR is called Foundations of Reasoning, which maybe puts us in an even tighter spot for an explanation. But pragmatically, what we define it as, or what we are working on, is large language models and the last phase of their training, the post-training, where we teach them to be good at math, be good at certain questions, have certain behaviors. We usually do this with what's called reinforcement learning, and we call this reasoning simply because the model then starts being able to reason, being able to solve math. And I have to say that my recent research, over the last few months, maybe the larger part of the last year, has been on verifiable domains like math, where you can really reason and you can even assert that you have reasoned, because the answer is either correct or wrong.
Ravid Shwartz-Ziv: So, wait, let's back up. There are a lot of things here, right? First of all, we said that reasoning is something very amorphous, basically because these are language models, so we can look at them as if they are reasoning and thinking about things. But now you have verifiers, right? And "verifier" is a big word for just making sure that the output is correct, right?
Julia Kempe: Yes. We are talking about verifiable domains here, and mathematics is perhaps the most verifiable domain: we have certain axioms, we can posit things, we can develop a proof, and we can in principle check this proof, so it is by definition verifiable. Of course, other domains are also somewhat verifiable. You can think of physics tested with a multiple-choice test: again we can verify that the language model picked the right answer. And then it becomes more fluid, right? What does it mean to verify, and where does the subjective set in? But for us, technically, reasoning is the domain where we have some level of verification at the moment. This is where we do research, and we're trying to understand how far we can push language models in this domain, in particular in math research, for instance.
Allen Roush: So I recently saw a thread on Hacker News discussing a blog post where somebody said, you know, I made a formal proof with Lean for my program and it had bugs anyway. And then you go in and read the blog post and find out it was because the Lean proof wasn't looking for this particular class of errors; it was a very specialized proof. So I guess my question is, what do we do about the extremely large number of ways that programs can be approached, or even attacked, so to speak, even with verification available? I guess what I'm trying to say is, how do we handle unknown unknowns?
Julia Kempe: Okay, so you are actually moving me away from the math, because I think software verification is a slightly different beast, about which I'm actually less competent. I believe it's true that when you start enumerating the errors you want to look for, and you look for them, and then there are many more errors, you get into trouble. This is not what I meant when I spoke about verifiable domains, and math in particular. In math, Lean has this property, which is beautiful for our research, let's put it this way: once you formalize the statement you want to prove, let's say Fermat's Last Theorem, in the Lean language, which has its own particular ways of stating things, and some mathematician has looked at it and said, okay, this is the Lean translation of Fermat's Last Theorem, then if you give me a proof and I formalize it, which is not trivial but can be done in principle, you just press a button, literally, and say, my dear Lean, build this and verify it. And that is really error-proof, because, to simplify, you just need to check that one thing follows from the next, and so on and so forth, until you reach the axioms that they have codified. A lot of mathematicians have done amazing work in setting up the infrastructure, which is called Mathlib, which at this point already contains something like over 2 million lines of proofs and statements, and they keep formalizing more and more areas of math. So it goes back to that, and then we're sure. So this is fantastic.
It's fantastic from an intellectual viewpoint, I think; I'm not so sure whether it's fantastic for the job prospects of a mathematician. But in mathematics you can find the truth, because in the end the proof is right or wrong, true or false. So if we have this in mind as our intellectual pursuit, we could imagine a machine, Allen, to your point, that goes off in various directions proposing various proofs. Not errors in this case: say we have an open problem, Navier-Stokes or the Riemann hypothesis or something, and we want to know whether it is true or false. The LLM might hallucinate or whatever, but it can go and produce various proof attempts, and because we have Lean and we can formalize, we can really check them. What that means is that we can scale up proving with LLMs. If we just have enough Claude agents sitting there doing their proofs, and as you know, now with Claude Code you can have these agents, you can run 30,000 agents and they can all go searching for proofs. Then we can formalize and verify, and a whole new industry can be put in place for mathematics. This of course isn't quite here yet, but I think it is way closer than maybe some of our mathematics colleagues believe, let's put it this way.
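The propose-and-verify pattern described here can be sketched in a few lines of Python. This is only a toy under stated assumptions: `search_with_verifier`, `propose_factor`, and `is_nontrivial_factor` are hypothetical names invented for illustration, the random proposer stands in for an unreliable LLM, and the exact arithmetic check stands in for a Lean build. It does not call any real prover or model API.

```python
import random
from typing import Callable, Optional

def search_with_verifier(
    propose: Callable[[random.Random], int],
    verify: Callable[[int], bool],
    attempts: int,
    seed: int = 0,
) -> Optional[int]:
    """Return the first proposal the verifier accepts, or None if all fail."""
    rng = random.Random(seed)
    for _ in range(attempts):
        candidate = propose(rng)  # unreliable generator: most proposals are wrong
        if verify(candidate):     # trusted checker: accepts only correct ones
            return candidate
    return None

# Toy "theorem": 391 is composite. A "proof" is a nontrivial factor.
N = 391  # 17 * 23

def propose_factor(rng: random.Random) -> int:
    # Hypothetical stand-in for a model's guess.
    return rng.randint(2, N - 1)

def is_nontrivial_factor(d: int) -> bool:
    # Hypothetical stand-in for the formal check: cheap, exact, no judgment calls.
    return 1 < d < N and N % d == 0

if __name__ == "__main__":
    print(search_with_verifier(propose_factor, is_nontrivial_factor, attempts=10_000))
```

The point of the design is that the generator never has to be trusted: wrong guesses cost only compute, because nothing is accepted unless the checker passes it.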
Ravid Shwartz-Ziv: So maybe you can share with us which parts you think are actually there, where you are sold, and which parts are still missing.
Julia Kempe: So in the area of math, we have this long, fantastic effort from the Lean and Mathlib community to formalize all of math. But some mathematics is much easier to formalize than other mathematics. For instance, defining what a prime number is is not so hard: it's only divisible by itself and by one. It doesn't take long to explain, and kids learn it very early in school. But then there are areas like analysis or partial differential equations where even the reasoning in books or papers doesn't go very cleanly axiom by axiom; it's a bit of reasoning by analogy, and it's not fully precise, you might say. And this is much harder to formalize, and...
Ravid Shwartz-Ziv: Like physics.
Julia Kempe: Right. And in fact that particular branch, the analysts, rightly thought it would either never be possible or take forever. But as we see now with these AI agents, it actually is possible. It's getting there, and the edifice is growing. It might not be as pretty as Mathlib was, or as principled as Mathlib was, but it's growing, and very soon we'll have this body of formalized math, and then every new paper is, say, a hundred dollars in formalization costs away. So you write a paper, you launch your Claude or whichever provers you use, and you verify. And as you know, many math papers carry errors, as in any field of science; humans don't catch everything. But this would be a beautiful world, where we can have peer review basically outsourced to LLMs in this area.
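To make the prime-number example concrete, here is a minimal Lean 4 sketch that uses only core Lean. `IsPrime` is a hypothetical name written for illustration; Mathlib has its own definition (`Nat.Prime`), which is stated differently. The second declaration shows, on a deliberately trivial statement, what it means for Lean to check a proof mechanically.

```lean
-- Hypothetical definition for illustration (not Mathlib's `Nat.Prime`):
-- n is prime if it is at least 2 and its only divisors are 1 and n.
def IsPrime (n : Nat) : Prop :=
  2 ≤ n ∧ ∀ d : Nat, d ∣ n → d = 1 ∨ d = n

-- A tiny statement together with a proof that Lean checks mechanically.
-- If the proof were wrong, the file would fail to compile.
theorem two_add_two : 2 + 2 = 4 := rfl
```

A definition like this fits in two lines, which is the sense in which number-theoretic statements are easy to formalize; the analysis and PDE arguments mentioned above need far more supporting definitions before a single statement can even be written down.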
Allen Roush: So I'm curious about specific branches of math too, because you've already given a little bit of analysis of that. What do you think about constructive mathematics? So people who say, okay, I don't just want the proof of existence, I want the ability to construct something. I believe the term for it is mathematical intuitionism. It's maybe a more niche community, but it's demanding something harder in a way, right? And I'm not sure whether these Lean proofs are ever constructive or not. I'm not really an expert here, so I'm curious, what are your thoughts on that question?
Julia Kempe: I think the Lean proof basically mirrors whatever the informal proof is. As far as Lean is concerned, and that's being done, there are papers with proofs, constructive or non-constructive, whatever they are, and we just formalize them so that we can verify them automatically. So it doesn't matter; this discussion is orthogonal to what we're discussing here. But in terms of areas of math that are already closer: everybody is now attacking the Erdős problems that have been overlooked over the last few years. Erdős was, as you all know, a famous mathematician who worked more on number theory and combinatorics, these kinds of problems. And these are easier to formalize because they're very precise, right? I don't know, there's a whole...
Ravid Shwartz-Ziv: How many? How many open problems are there?
Julia Kempe: Of Erdős's? I don't know whether Erdős posed them all or whether they are attributed to Erdős. There's a whole database of them where you can look, and they're unproven. Sometimes they're unproven because nobody cared to prove them, frankly. And these are now, what can I say, the favorite field of delectation of AI mathematicians, because they are now falling.
Ravid Shwartz-Ziv: But if we have an LLM prove them and formalize the proofs with Lean, do you think that, besides just proving them, we can get something out of it, that we can learn new things?
Julia Kempe: Yeah, this is a question you could have asked of mathematics even 500 years ago, right? What do we learn from mathematics? And that's a question we can discuss or not discuss, because mathematics, of course, has been very tied to the sciences. I mean, think of special relativity, or even Riemannian geometry and general relativity: the math came in handy to then formulate physical laws. But I think the world of mathematics has taken on a life of its own. There are mathematicians who are visionary, who think: here's a program, this is a sequence of things we should be proving to understand certain things, to clarify the order of objects. There are also mathematicians who just really need certain mathematics to be able to describe turbulence and flow and so on. You didn't ask me why we are doing art. In some sense mathematics is partially done as a form of art, but I think it is also a very useful tool for the sciences. And as for this verifiable reasoning, why are we so hooked on math? I think there is some belief that math is very precise and clean in its deductions, and that when we teach the language model to be like this with this reasoning post-training research, not with Lean, but just in these post-training RL loops, there's hope, and there's also some evidence, that they become a bit more logical in other fields too, that these capacities kind of transfer. I mean, why are we teaching kids math in school? I presume it trains the brain, but we also hope that the logical reasoning is useful in other areas.
Ravid Shwartz-Ziv: Yeah, I think it's interesting, because in the past, when the first generative models came out, we saw they were really good at generating images, very abstract images, and people thought, yeah, maybe art is something that these machines will do. And now we understand that human art will probably stay, and the more verifiable fields will be solved much sooner. So do you think that if a field is very verifiable, if it has verifiers, then it will be solved? Or is it more like art, where you need vision in order to invent new problems, and you just check them with LLMs?
Julia Kempe: So this is a broad question. I have at least two things to say about it, and also about art in connection with it. Because what does it mean to solve math? Of course, even now we all, and I assume you also, work with Claude and delegate things that in the past we did ourselves or a graduate student did, like proving certain lemmas or even doing the figures, whatever it is. These little tasks that need solving, like lemmas, I think will be done by an LLM. But then the bigger question, and that's also true for math, is that we need to define what we care about and what's interesting. Yes, it can do a lot of weird incremental stuff that nobody cares about. And then who gives meaning to this? Is that a human or is it a machine? Can we teach a machine to define what is interesting and what is relevant and what is beautiful and what is useful? This is yet another question. So will math be solved? I mean, once we decide what's useful and we lay out a program, like Poincaré, one of the mathematicians, did, the little tasks on the route to it could probably be outsourced to an LLM, and very soon, I'm sure, very, very soon. But the bigger question remains. Math is a bit open, right? It's like art. It's not like there's a paved path from A to B and you go and you're done. And that piece, I think, will stay with humans for a while. And then there's solving problems: we will have AI that can solve certain pieces, but then we have the big problems to attack, climate prediction, conflict resolution, all these things. We can put AI to use there; that's AI for science and so on. And then for art, you were also saying LLMs are good at art. I recently read a blog post somewhere, unfortunately I don't remember the source, but it made this beautiful analogy to when photography came up.
So in the past, you remember, all these painters were at a court, with a nobleman, and they drew all these portraits, and very often the art was to be as realistic as possible. It was a whole art. And then photography came along, and they were all saying, oh my god, what now, we will be out of a job. And think about what happened then, 100 or 150 years ago: all these interesting new art forms appeared. I'm thinking of painting now, which diverged from reality, because reality could be captured in a photo. We don't need an exact portrait of, whatever, Queen Victoria. People gave their own meaning to it, and we had abstract art, and we had Matisse and Picasso and all these things. So art changed with photography. And I think our way of doing science will also change, because we have these rational pieces that are done or will be done very soon. And it's now for us to develop this new way of utilizing this.
Ravid Shwartz-Ziv: Do you think it's also true for physics?
Julia Kempe: I think physics is already harder to attack, because there's maybe more ideation in physics and a little more discovery of phenomena, I wanted to say. It's not as clear-cut as in math, you know, "prove the Riemann hypothesis." What is physics about? It's about explaining what we observe in nature, and also analyzing the data. Of course, in data-rich physics fields like cosmology and astrophysics, there's already a lot of use of AI just to do pattern analysis, to do inference, simulation-based inference, where you try to work out the cosmological constants from the data. There AI has already been extremely helpful for the last 20 years or even more, I would say. And then there's physics that's related to math, where we try to write a new quantum field theory for something. So all these aspects will be filled by LLMs, but maybe less straightforwardly than for math.
Ravid Shwartz-Ziv: But we have two things here. One is that physics is involved in the real world: you need to collect data in the real world, you need to run experiments, and it looks, at least for now, like models are not so good at that. We don't have good robots, or good enough robots. And the other is building a good simulator that can simulate everything and make it verifiable. So do you think this is why in some domains, like physics in this case, humans still need to be in the loop, and we need to work really hard to generate new data and to validate the experiments, the results, and the analysis?
Julia Kempe: I'm sure of it. And maybe an even more pointed field, which is similar in some sense, is biology and medicine, where of course AI is being used massively now to make predictions and to mine the data, like clinical trials, and also for drug development. One of the uses of AI, and that could also be for physics, is to design experiments, in the sense that it tries to figure out which experiment is the most useful to do next, so that the data I get gives me the maximum information toward the next insight. So bang-for-the-buck optimization, and that's where AI already comes in. I know there are companies, drug development companies or just medical research companies, doing this, and physics is the same. In medicine it's maybe because you need... And then of course we don't have the robots, which is a whole different part of AI that's missing. I mean, yes, we have language models and they are actually stunningly amazing, but they don't do the experiments, and robotics is lagging behind. And I personally want to pivot my own research more towards AI for science, and also towards the AI that doesn't work so well yet: noisy data, uncertainty, and so on and so forth.
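One standard way to make "which experiment gives me the maximum information" precise is expected information gain: the expected drop in entropy over competing hypotheses once the outcome is seen. The sketch below is an illustrative assumption, not a method named in the conversation. It uses a small discrete set of hypotheses, and the function names and the example experiments (`sharp`, `weak`, `useless`) are hypothetical.

```python
import math
from typing import Dict, List, Sequence

def entropy(p: Sequence[float]) -> float:
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def expected_information_gain(
    prior: Sequence[float], likelihood: List[List[float]]
) -> float:
    """likelihood[h][o] = P(outcome o | hypothesis h) for one experiment."""
    gain = entropy(prior)
    for o in range(len(likelihood[0])):
        # Probability of seeing outcome o, averaged over hypotheses.
        p_o = sum(p_h * likelihood[h][o] for h, p_h in enumerate(prior))
        if p_o == 0:
            continue
        # Bayes update: belief over hypotheses after seeing outcome o.
        posterior = [p_h * likelihood[h][o] / p_o for h, p_h in enumerate(prior)]
        gain -= p_o * entropy(posterior)
    return gain

def pick_next_experiment(
    prior: Sequence[float], experiments: Dict[str, List[List[float]]]
) -> str:
    """Choose the experiment with the largest expected information gain."""
    return max(experiments, key=lambda name: expected_information_gain(prior, experiments[name]))

# Hypothetical example: two equally likely hypotheses, three candidate experiments.
PRIOR = [0.5, 0.5]
EXPERIMENTS = {
    "sharp": [[1.0, 0.0], [0.0, 1.0]],    # outcome fully separates the hypotheses
    "weak": [[0.6, 0.4], [0.4, 0.6]],     # outcome only nudges the belief
    "useless": [[0.5, 0.5], [0.5, 0.5]],  # outcome is independent of the hypothesis
}

if __name__ == "__main__":
    print(pick_next_experiment(PRIOR, EXPERIMENTS))
```

Real experimental-design systems also weigh cost and work with continuous, noisy models, so this shows only the core of the "bang for the buck" idea.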
Allen Roush: And what you were just talking about, the relative lack of progress in robotics, I believe it's called Moravec's paradox, right? It's related to how human babies rely on millions of years of evolution, and we didn't evolve to play chess, which is a kind of post hoc explanation for why it was relatively easy to make models do a type of reasoning even 20 years before the generative AI revolution that we see today. Do you agree with this? Do you think this assessment is correct, in that for a while longer we might continue to see robotics structurally lagging behind brain-type tasks?
Julia Kempe: I can only speculate as well. It's definitely true that evolution took a long, long time to produce an animal that moves and grasps and knows where to jump and all these things, and language on top of it is just a tiny sliver; humans and language are like nothing on the evolutionary scale. And then you could reason that, because that piece didn't take evolution so long, it's the easiest to reproduce, and clearly that could be true. But it's a statement, an analogy; it's a beautiful statement, but...
Allen Roush: Yes.
Julia Kempe: Beyond an analogy, I don't know what to say, right? And language also has other properties: it's tokenizable, it's discrete, and the noise level is limited. Once you have a sentence, the continuations are not so many; I mean, there are some, but, you know, next-word prediction. So there's a lot of convenient stuff for machines here. Plus we have this huge corpus of language, the internet, that went into or is going into the training of the machines. So that was of course a lot of serendipity, a lot of circumstances that came together well. Now you're asking me why we are so much behind with robotics. I think there is also a bit of a problem of data scarcity here. For language, we've used these corpora with trillions of tokens of training data; with robotics we don't have that. And we have this issue of planning and latent-space modeling, which we haven't understood. And I love the title of your podcast, Information Bottleneck, because with evolution, we are going through this bottleneck. Our brain is finite, and in fact we know the limitations of our brain. We can't even, and that's this LLM example, "recite me Harry Potter from beginning to end": I haven't seen a human yet recite more than page two or so. So we have this bottleneck, which evolved with its inductive biases for a certain purpose, in particular for motion and so on. And somehow I don't quite see this in our large language models; I don't see this information bottleneck that much. Yes, they don't have as many parameters as data, that's true, but it's a very different process from evolution, I find, in these models.
Ravid Shwartz-Ziv: Yeah.
Allen Roush: I just want to quickly ask, because historically, my understanding is that those old epic poems like the Iliad, the Homeric poems, were supposed to have been passed down by oral tradition. I don't know how true that actually is. But at one point we were apparently memorizing book-length works. And indeed, I seem to recall that maybe it was Socrates or Plato lamenting that learning to write eroded our brains' ability to do this, that it led to cognitive atrophy. So I'm curious, do you think that kind of dynamic happens today with LLMs as well, where we now get cognitive atrophy?
Julia Kempe: I also know this very beautiful analogy, and this lament indeed. And then I think with the printing press there came yet another lament, that humans don't write anymore, and oh my god, what will this do to the brain? I think it's a bit zero-sum in some sense. We don't know; we can speculate. You can take the optimist's or the pessimist's view: we can atrophy, or hopefully it frees up room for something else in our brains. And indeed, certain capacities get lost. Think of sports. We're doing sports out of pleasure, or duty to our bodies, our health, but we are no longer in the position where we need to lug the mammoth from the woods to the cave or something. So indeed, we have to start living with the tools and not atrophy. My colleagues have started telling me that from time to time they write something themselves because they're afraid they'll forget how to write when GPT writes all their papers and formulations. Programmers tell me that they sometimes just go off and program a little bit because otherwise their brain will atrophy. I think we humans will learn to live with this and do the sports, the mental sports, ourselves; at least some of us will. So I'm a little less worried, but I'm very curious how humans will co-evolve with the machines. And I think there's actually something really positive coming out of it. As you said, I'm somebody who likes to do many things. I get bored very fast. I did quantum computing, I did finance, I did all these things, because I love a diversity of things, and I was always lamenting internally that I don't live in the Renaissance, because there you had this concept of the, I don't want to say man, but it was called the Renaissance man.
Renaissance man, Renaissance woman: that was meant to be a person who could do kind of everything. Leonardo da Vinci, right? He could do anything. He could paint, and he could do anatomy, and he could do some physics. And then it was said that now nobody can do this anymore, because we need to specialize so much. And I find now that the Renaissance is back with the tools, because honestly, I don't need to prove the concentration inequalities every single time when I kind of feel what the truth is. When I write a paper or something, I can outsource. I want to learn biology? Fantastic: GPT explains the broad strokes, and I learn enough biology to hopefully make some sense of it. And then I can connect topics. So think about it: you are thinking of it as atrophy, but I think there is also a different aspect, that we now have some room in our brains to do the Renaissance thing again.
Ravid Shwartz-Ziv: I think we have a title for the episode: "The Renaissance is back" is a great title. But what kind of researchers do you think we actually need in this era? If you need to do a lot of things, a lot of high-level things, you need something like managers, or people who understand which details are important. What types of researchers do you think we need?
Julia Kempe: Yeah, and I want to relate this to something. I think Sam Altman was asked in some interview: now that GPT and Codex code so well, do you think we should still teach our kids programming? Which is the same question in a different, concrete way. And he thought for a while and said, yeah, programming is still useful. What I observe in working with the Claudes and the GPTs and the many agents, all of them together, is that it does help me to have had this education, where I had to sit down and take integrals at some age, 15 or whatever, and I had to do some programming, and I suffered through my sums. I was an Olympiad kid, I did math Olympiads, so I suffered through all this number theory and geometry. Or not suffered, I actually loved it. But in any case, I went through it all at least once, at an early age in my case, and after that, most of the time, I already had some sort of intuition. I actually did less and less of it, because I kind of felt what the truth is, because I had done it. And now I work with Claude, and it starts formalizing something in Lean, for instance, which we currently do, and I kind of feel there's something wrong in how it's trying to do these inequalities. So, in short, I have developed an intuition, and how did I do it? By doing the stuff myself. I've thought about it a lot, because I do have kids, and I'm wondering what I should tell them to do or not do. Not that they listen, but in any case, while they still listen. And I came to the conclusion that yes, they should do some math and suffer through it, or go through the motions. Same for programming, but they don't need to do Python and Java and C++ and the Julia language and this and that. I believe that's not necessary anymore. But they should do one, in order to understand the principles.
And this leads me back to say: maybe our school education needs to change somewhat, but the fact that we teach all these subjects from the base up in the beginning is a good one, I think. And I think this will be something we'll need when working with the tools, because otherwise they'll fool us. Particularly the agents we work with for formalizing: they love to take shortcuts. It says, I'm done. And I say, did you prove the lemma? And it says, no, no, but I declared it to be an axiom. And I say, well, you know what, go and prove it. So, these things.
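To make the shortcut described above concrete, here is a minimal Lean 4 sketch. The names `key_lemma`, `main_result`, `key_lemma_proved`, and `main_result'` are hypothetical, and the statement is deliberately trivial; it is not taken from the formalization work discussed in the episode. It only shows that Lean accepts a theorem built on a declared axiom, and that `#print axioms` is one way to notice.

```lean
-- Hypothetical names; the statement itself is deliberately trivial.

-- The shortcut: the lemma is assumed rather than proved.
axiom key_lemma : ∀ n : Nat, n + 0 = n

theorem main_result (n : Nat) : n + 0 = n := key_lemma n

-- Lean accepts `main_result`, but this command reports its dependence on `key_lemma`.
#print axioms main_result

-- The honest version: the lemma is actually proved, so no extra axiom is introduced.
theorem key_lemma_proved : ∀ n : Nat, n + 0 = n := fun n => Nat.add_zero n

theorem main_result' (n : Nat) : n + 0 = n := key_lemma_proved n

#print axioms main_result'
```

Both versions type-check, which is why a human who knows to ask "did you actually prove it?" still matters.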
Julia Kempe: We will need this core education. At least this is my hope, and this is what I think, and this is what I tell my kids. But then we need it only once: some sort of principles of programming, principles of math, biology, all these things. And I do believe education should now go for breadth; depth too, of course, but breadth, because, as I said, the Renaissance is back. And beyond that, of course, it's all work in progress. I also don't have the answers, right? I do see a lot of students, young undergraduates: it's convenient, you know, you're tired, you want to go to sleep, so why do the homework, why sit through an all-nighter? You put it through GPT. And then we do have these curves: all the homeworks are perfect, and then the exam happens and there are lots of failures, right? So we all have to develop a certain culture of understanding why I need to do something even though GPT can do it, like sports, right? And that will take a certain culture shift, I think.
Ravid Shwartz-Ziv: So what have you tried at the university? What is your take on using homework exercises as a test? Are you doing exams? What is your take on how to teach them good foundations in practice?
Julia Kempe: Yeah, so I haven't thought about it a whole lot, because I was actually on leave for a while, so I haven't taught in the last few years, I must admit. But thinking this through: when I went through university, which was a while ago, there were still oral exams; they still existed at the time. It was small enough for that. They don't scale, of course, right? You can't do oral exams for a thousand students if you're one professor. But that was something individualized that helped, right? People knew what they knew. I mean, the teachers knew what the students knew, et cetera. And there was a more personalized component, because we were smaller groups. Now, on the other hand, we have AI that could take on the role of personalization, in teaching the kids at their level, right? Because where you learn the most is when the teacher grasps your level and challenges you at your level, or a bit higher. Not too high, not too low. And I think a lot of time in the past was wasted on this. That said, so we have AI to help us, but then how do we assess students? I think, first, we can again leverage AI to do exams on a personalized level. And we should not insist, as I said, on redundant stuff like various, you know, Python. I mean, some curricula in universities, in computer science, contain a lot of stuff that I now find a bit redundant. But yeah, I'm not sure I answered your question. How do we convince them? These are two different things, I think. And of course we should definitely not deny the tools; they are there. I mean, you cannot pretend that there is no GPT, pretend you don't have the free version of Gemini. Of course not, right? Critical thinking will be much more important now because, as I said, the tools can fool you quite a bit. And so we need to teach that to them. Maybe we should have some sort of...
Ravid Shwartz-Ziv: Mm-hmm.
Julia Kempe: You know, mock things where GPT comes out most convincingly, and then we say, okay, now find what went wrong. Stuff like that. Much more of this, right? Yeah.
Allen Roush: What do we do about political and populist backlash against AI, even within academia too? So an example, right: Sam Altman just had somebody throw, what was it, Molotov cocktails at his house. And I think there was another person who just tried to attack him recently, or attack his house, or something like that. There's a lot of journalistic ink that's been spilled about the strong anti-AI sentiment that has been growing in certain parts of the country. I'm from Portland. There was an article that got a lot of traction titled "Everyone in Seattle Hates AI." I have generative AI, "Gen AI," as my license plate, and I get flipped off a lot on the road for non-driving-related things. So I have my own personal anecdotes on this. And now there are even politicians who are becoming very anti-AI; Bernie Sanders, for example, has started to espouse especially anti-data-center-development kinds of rhetoric. So what I'm worried about, and I'll also say one final thing, is that in a lot of the academy, maybe not the computer science department, but a lot of the social science departments, many of them are, in some cases, fairly far to the left. Meaning that if we're seeing people in those kinds of communities become anti-AI, I'm very worried about all sorts of very chaotic and maybe dangerous dynamics that could lead to. And I've finally also heard that enrollment at many universities across the country is kind of down anyway, but I've heard some students don't want to take AI because of fear of this kind of backlash or stigmatization on campus. Are you seeing any of this, or is all this like fantasy to you? What's your take on this?
Julia Kempe: I mean, I want to say two things here. Again, there are many more things to say. First, I want to actually continue with what you've been saying, and then I will take the other side. So you are absolutely right. People are right to worry; the sentiments come for a reason. And one problem, of course, with the large language models as we have them now, and their proliferation, is the energy problem. That's not denied, right? Data centers, more data centers, more lakes burned, more... The energy problem is huge. And I think one area of AI research should go towards more energy-efficient and more data-efficient things. And again, the way research played out, with taking the whole internet, stuffing it in the largest model possible, hoping for the best, led to these large models. I don't think it's an imperative. I'm hoping there are other solutions. I mean, I'm an example and you are one, right? You have a brain, I have a brain, Ravid has a brain, and none of us burns even... you know, there are these computations of how much energy we burn, and of course it's not comparable. So definitely we must be conscious of the problems that exist. One is the energy, and second, of course, is security: AI nowadays has become so powerful that it can manipulate, it can hack, it's a real security threat. So I want to harp on what you said in order to emphasize it. Okay, but then of course I also think the solution is not to deny AI. It reminds me a little bit of, I think it was the Luddites in history, who went and destroyed the machines because the machines were taking their jobs, and also, you know, for similar reasons, right? So I don't think that can be the solution, and if people think it's a solution, there will be other people who will come and abuse it, you know, who will use it.
Yeah, so this is not the solution. And the solution would be... indeed, I've seen lots of interesting initiatives starting to happen. Because there's another aspect I fear, which is authoritarianism: you know, AI would allow authoritarian regimes to become even more authoritarian. We all know how that works. And, if unconstrained, it might also make the gap between the haves and the have-nots even bigger. So all these problems exist, but I've seen beautiful initiatives and I would encourage people to look at them. People like to volunteer. I mean, there's a lot of goodwill in tech too, and of course elsewhere, and for instance we could empower nonprofits with AI. Just like the for-profits, so to speak, are empowered by AI, there's a lot of volunteering a tech person can do for a nonprofit so that it becomes 100 times more efficient, just like the others become 100 times more efficient, or a thousand times more efficient, right? So I think it's also time for grassroots AI in some sense, because it's a powerful thing, and in the hands of whoever has this power it can be done. And I really encourage the people who say "I hate AI" to maybe, on the contrary, start thinking, how can AI help me? Because you don't hate AI, I presume; you hate the consequences of what AI does. So the thinking now should be, how can I use AI to prevent what I fear AI will do? I mean, again, I'm an optimist, so maybe I'm living in some la-la land, but that's my thinking on this.
Ravid Shwartz-Ziv: And what do you think about the future? We talked about efficiency and data centers and energy efficiency. Do you think the future will continue the current trend of scaling everything, where we just need more and more power, more and more data centers, and let's hope that we will have enough capital to build all these data centers? Or do you think we will move to, let's say, more separate models, smaller models that will try to do different sub-tasks?
Julia Kempe: Yeah, so I mean, my guess is as good as yours. I do believe important research should be done on this, you know, scaling down. Not scaling down in that sense, but developing methods that are energy efficient: smaller models that achieve the same thing as bigger models, for instance, or different ways to process models. This kind of research needs to be done. And unfortunately, right now, among the big tech companies, and I'm thinking, you know, OpenAI, Anthropic, Google, Meta, there's a bit of a race. So currently there's still race-and-scale thinking, right? And that doesn't allow it, because it's such a rush. There isn't time to do the kind of research that would lead to more energy-efficient things. That actually gives me a lot of hope for academia, because by definition academics don't sit on half a million GPUs. They don't. And so just from that constraint, that constrained way of doing research, I have hopes that good ideas are emerging and will continue emerging. And I really believe also in this open-source and open-publishing world where these ideas flow freely, which is part of the academic culture. So, again, I'm optimistic, but I also think there is a limit to scaling. I mean, there's only so much energy and so many data centers we can build. So it's interesting. Two years ago, I was working on a different kind of thing that has to do with scale, which was called, well, we didn't call it that, model collapse. This was: what if we run out of data, right? What if there's no more data? I mean, the internet, we've already stuffed it in once, right? So what happens if we now don't have more data, more text? And it's not very hard to show that if you don't have more data, and the AI produces the data and then you stuff it back into the AI, it will kind of degrade and collapse. So that's it. Sorry.
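The degradation described above can be illustrated with a toy simulation. This is a sketch under strong assumptions, not the analysis from the model collapse work mentioned in the episode: the "model" is just a one-dimensional Gaussian, each generation is fit only to a small sample drawn from the previous generation, and the function name `simulate_collapse` and all parameter values are made up for illustration.

```python
import random
import statistics


def simulate_collapse(generations=200, samples_per_gen=10, seed=0):
    """Refit a Gaussian on its own samples, generation after generation.

    Returns the fitted standard deviation at each generation, starting
    with the "real" distribution (mean 0, std 1).
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    history = [sigma]
    for _ in range(generations):
        # The current "model" generates a small synthetic dataset.
        data = [rng.gauss(mu, sigma) for _ in range(samples_per_gen)]
        # The next "model" is fit only to that synthetic data.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        history.append(sigma)
    return history


if __name__ == "__main__":
    stds = simulate_collapse()
    print(f"std at generation 0: {stds[0]:.3f}")
    print(f"std at generation {len(stds) - 1}: {stds[-1]:.3e}")
```

In this toy setting the fitted spread shrinks over generations because each refit loses a little of the tails and nothing puts them back. Real language models are far more complicated, so this only conveys the intuition.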
Ravid Shwartz-Ziv: But we don't see it in practice. I know that there are also theoretical works, Yarin Gal's work for example, that show that if you feed it back again and again, it will collapse. But in practice, we don't see it. We see that synthetic data works really well most of the time.
Julia Kempe: I agree about synthetic data, but not all of it. I think nobody would contest, perhaps, that if you just indiscriminately generate data and stuff it back in, this will collapse and the model will get dumber; you get collapse. But if you take synthetic data, so data generated by the model, and then you verify it or filter it in some smart way, then it tends to help. And also, we haven't actually really exploited all the data on the internet yet, because when it's uncurated, there's so much garbage that, just with data curation, there's still juice to squeeze, right? But that leads me again to this: the progress comes now in all these fields, code and verifiable stuff, where we can automatically check that it's good data, and so on and so forth. But what about this other data that is not verifiable? I think there the jury is still out. I don't see that model collapse wouldn't happen right there. And that's again, yeah. So that's another research direction, data scarcity, because, I mean, obviously a human hasn't seen... As Yann LeCun likes to say, we learn to drive in 10 hours of driving, and a model needs I don't know how much data in order to drive. So clearly we haven't solved something, and clearly there's a solution, because we can do it. So I think the research will go that way and we'll figure it out. Maybe not next year, but yeah.
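The generate-then-verify idea above can be sketched in a few lines. Everything here is a hypothetical stand-in: `noisy_generator` plays the role of a model that sometimes produces wrong answers, `verifier` is an exact checker that only exists because the toy domain (integer addition) is verifiable, and the 30% error rate is an arbitrary assumption.

```python
import random


def noisy_generator(rng, error_rate=0.3):
    """Stand-in for a model: proposes an addition problem and an answer.

    With probability `error_rate` the answer is off by one, mimicking
    a plausible-looking but wrong synthetic example.
    """
    a, b = rng.randint(0, 99), rng.randint(0, 99)
    answer = a + b
    if rng.random() < error_rate:
        answer += rng.choice([-1, 1])
    return {"a": a, "b": b, "answer": answer}


def verifier(example):
    """Cheap exact check, possible only because the domain is verifiable."""
    return example["a"] + example["b"] == example["answer"]


def build_dataset(n_candidates=1000, seed=0):
    """Generate candidates, then keep only the ones the verifier accepts."""
    rng = random.Random(seed)
    candidates = [noisy_generator(rng) for _ in range(n_candidates)]
    kept = [ex for ex in candidates if verifier(ex)]
    return candidates, kept


if __name__ == "__main__":
    candidates, kept = build_dataset()
    print(f"kept {len(kept)} of {len(candidates)} synthetic examples")
```

The filter removes every wrong example in this toy case. For data with no exact checker, there is no drop-in replacement for `verifier`, which is the open question raised in the conversation.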
Ravid Shwartz-Ziv: And what do you think are the most exciting directions for the future in AI? Is it continual learning, memory, robotics? Well, it's for research and also for production. It's not clear these days what the line is between research and products in AI, but...
Julia Kempe: For research.
Ravid Shwartz-Ziv: What do you think are the unsolved exciting problems?
Julia Kempe: I mean, in AI, again, as I said, I feel like in the language model field, it's become very much a race, about scale, and about engineering. I mean, engineering is also research, but a little more specific, right? How do I change the architecture a little bit, and this and that, then manage my context better, produce hierarchical skills or whatever. So that research is definitely there. But what I find exciting now, because working with AI, with the Claudes and the GPTs, shows me how far they've already gone, is that we definitely should start attacking the problems we have. You know, humanity isn't exactly in great shape: climate, all these things. We should now get ourselves in order to do AI for science, for biology, AI for medicine. I think it's just at the beginning, in my mind. There's so much potential to find more insights, you know, cancer. I mean, I think these questions can now really be solved, or be solved better, or more. I mean, we can make more progress on these questions. So that's the research with AI. OK, that's one thing. But your question is, what can AI not do yet, where we as AI researchers should focus? Is that kind of your question? I mean, there are things that don't work well. Robotics aside, which doesn't work at all in my mind, and, you know, me still not having somebody putting my dishes in the dishwasher is a real problem, right? So I think robotics, and of course there is a lot of research in robotics that needs to be done. Something that is very important now, I think, is security and AI safety. I mean, people have said that many times, but now I think we are at a stage... I mean, you've seen, around Mythos, Anthropic's model, how AI safety has become really, really important, and I anticipate that many startups are now already springing up in this area.
But, you know, I think memory, of course, planning, all these things aren't so great yet with LLMs, and they all become important. I mean, they are important research topics at this point.
Ravid Shwartz-Ziv: For security, do you think we actually have a clear path to solving it? Because you have a model that you don't really control. So yeah, you can try to make it more structured and things like that. But do you think there is a real way to make good progress there?
Julia Kempe: Are you asking me, is it inevitable that the models will come and attack us, or how it will work out in the hands of evil humans?
Ravid Shwartz-Ziv: No, no. I mean, okay, so we have more powerful models, that's fine. They will probably find some vulnerabilities, some problems in our code, and people will try to use them. But do you think there is something more fundamental to work on, besides, yeah, let's try to prevent the models from doing it, or let's try to prevent humans from using the model to do it? Do you think there is something more than that?
Julia Kempe: So I'm not sure I will answer your question, but I was thinking about this, because at the moment we kind of still think of our models as being like a good human, or like the sum of many human experts, right? I mean, in the end, at math, it's kind of like the best mathematician at various problems. It's like the cumulative sum of the best specialists, right? And I recently saw a talk by somebody from Anthropic where they were describing the progress in their models over the years, over the last two, three years. And they described it with a proxy, and the proxy was how much they paid their annotators. Okay, data annotators. And so I think two, three years ago, annotators were being paid ridiculously low sums, because all they had to do is say, okay, this is better than this, and that's it. This is a nice cat. This is an ugly cat. Whatever. This cat has three eyes, bad, right? Then, you know, whatever, a year ago, they already had to pay more, because there was more specialist knowledge needed. And now, they said, in the last few months, six, seven, they really need not just PhD level, forget it, even that is out, but practically Nobel Prize winners. I'm exaggerating, but the specialists of the specialists. So the annotation cost goes up. This is a proxy for them for how much knowledge the machine still needs in order to learn something. And basically we're at the level where there is no more, I mean, maybe in a little bit, no more human who can annotate anything, because we have reached that level. We've squeezed out the human expertise. And for me now there is the question, what now? Will the model, or will it not, transcend the human, right? I think the problems will be different if we have this superhuman model, as compared to having the best of all humans, where there is still work to be done even for the best of all humans. Let's not kid ourselves, right, when they start hallucinating. We all know it's unpleasant. But still, it's a fundamental question, because I think security against an army of superhuman bots is a very different beast than security against the best human, right? We understand, kind of, what the best human is. But I think even just conceptually, what is security? How do we protect humanity against bots that are, and I can't even imagine what it means to be superhuman, but in some sense superhuman? That is an interesting question for me. And this we need to start thinking about, right? Because we don't want to be... you know, there are a lot of people having scenarios, doomsday scenarios, for when the bots become, you know...
Ravid Shwartz-Ziv: Yeah.
Julia Kempe: shall we say, very, very smart, and then we're doomed, right? There are a lot of these people, and maybe we need to think about how a weak human can create bots that are superhuman but won't eliminate humanity.
Ravid Shwartz-Ziv: Yeah, I feel that the problem in these areas is that it's not clear, because all the models are closed source and we don't have access to them. Anthropic, for example, didn't even give most people access to their API. It's not clear what is PR and what is actual progress, whether they're actually afraid that the model is so good that it will take over, or whether it's just a PR thing, like, yeah, give us money because we have a great model, right? And I don't know, probably the solution will be some combination of regulation, like government regulation, and self-regulation. But until then, until we clear up how to do it, I think it will be very confusing how to evaluate the current state.
Julia Kempe: But I mean, if you are questioning whether their model is as good as they claim, right, maybe it is, or maybe it isn't. But we have examples. For instance, think about AlphaGo, right? That's like 10 years ago. You all remember that it was trained with reinforcement learning, at Google DeepMind, and then this model came out and they let it loose on the best champions. And, you know, this was a big aha moment for humankind when it won, right? The model, that is, AlphaGo. And nobody contested this anymore, right? And we know how it works: it learns by playing against itself, and because the rules of Go are so confined, you don't need a human anymore. You don't need the best human expert to tell you if you won or you lost. And this way, the model managed to become superhuman at playing Go. We know this, so we acknowledge this, right? We know the thing is better than any human. Same for chess, right? So we know that in principle, in these confined domains, it is possible to have a model that's better than any human. Clearly, I mean, nobody doubts that.
Ravid Shwartz-Ziv: Yeah. Yeah.
Julia Kempe: So whether the latest Anthropic model is such a case or not, I cannot say, but we do have precedents in restricted domains, so I wouldn't exclude the hypothesis that eventually the model will have capacities that we can verify.
Ravid Shwartz-Ziv: But what about all the autonomous research and self-improving models? Do you think we are already there, or that we will see it in the near future? Because it's not clear. Maybe you can tell me: do you think all the problems, like hallucination... There is the Mirage paper, did you hear about it? We talked about it last time. They gave a VLM the questions from some vision dataset, but they didn't provide the actual data. They just gave it the prompt, the question. And apparently the model didn't say anything like, no, you didn't provide me the video, right? It just started to hallucinate some random frames and random details and objects and persons. And the crazy thing, by the way, is that it actually improves the model performance: it was much better than random, almost 80 percent of the performance you get when you give it the full data. So do you think these problems are solvable, or do you think there is something more fundamental that we are...
Julia Kempe: I see what you're saying. So I actually do think they're probably solvable eventually. Again, it's domain by domain, I think. So Go fell, chess fell. I think math will also... I don't want to say math will fall, it sounds so bad, but because it's verifiable and the rules are so clear, I think that will probably be the next Go, so to speak, even with very similar methods, surprising me, or maybe not surprising me. So are you asking, will we solve these hallucination problems and these spitting-out-dataset problems? Why not? Is there something fundamental? I don't know. I think... Yeah. But you know, it's...
Ravid Shwartz-Ziv: But also for other domains, when you don't have very strong verifiers.
Julia Kempe: That's the question. I think we are now coming to the conclusion that as soon as something is somewhat verifiable, we're doomed, and the machine will just go off, play against itself, understand things, prove and verify, prove and verify. Think of it as a game between a prover and a verifier, and eventually it will, you know, bootstrap itself out of human knowledge and go beyond it. I think that is clear. And then the question is, right, what about the semi-verifiable domains, the ones that are not fully... Yeah, so I think progress will come step by step, but I do believe it will come.
Ravid Shwartz-Ziv: But even, I don't know, for code, for example, right? This is maybe the second most verifiable domain after math. We see that if you do vibe coding, we are still not there. What do you think are the missing parts? Why do you think we still can't have end-to-end, very good, efficient applications, for example?
Julia Kempe: I think there's still the human in the loop, right? And if the human in the loop is an uncritical human, everything is doomed. Because, I mean, I see it myself when I vibe code. It already does a lot of tasks very well. Like, my webpage is entirely vibe-coded. I just say, you know, my publications, go on arXiv, have a look, quickly check OpenReview, go, here's my CV by the way, and it does it, right? So obviously these semi-simple things are... And what's hard, I think it's still hard, but I actually don't think it's insurmountable, and we can even name what's hard, right? It is when it comes to code maintenance. What's important for companies that make code is not so much to turn out the application, but to maintain it, right? Through all kinds of versions and changes of the environment. This, I think, it just wasn't trained this way. I think that's something we'll probably see next. And it has a bit of a tendency to go overboard. I mean, the code, if you start vibe coding, I don't know if you have done it, it's like piling up, and it just adds yet another and another and another. So the pruning piece is a bit lacking. But I think that will also, I mean, it's getting better. We have compacting, we have skills, we keep saying, okay, now sit down and spend as long on streamlining the code, make it... I actually see this, I honestly believe this is not even years away, I think this is months, these fallacies. The security fallacies, similarly, right? Whatever you teach a good programmer, you can eventually teach Claude, or whoever is your favorite coding agent, these principles. And it already works; we see it when we formalize in Lean. It can go all overboard, so we give it skill files that say, these are the principles, I mean, don't break them. I mean, you have to... beautiful code this way and that way. So honestly, I do not agree that this is insurmountable. I think this is something we'll see.
We'll get better and better very quickly.
Ravid Shwartz-Ziv: What do you think about all the human intervention? For example, now you have your agent in WhatsApp or something like that. And I just saw on LinkedIn that someone convinced the agent, the WhatsApp agent, that he's a dog and now it needs to give him some treats. And then, after he promised it a good treat, it revealed all the secrets, all the APIs and things like that, of its users. So do you think... how to say? We have a mix here, right? Because on the one hand our agents are talking with humans and need to interact with humans, and on the other hand we also want them to be very secure, we want them to write secure code. With humans, we know that humans can write good code and can also interact with people. But do you think agents can do both as well?
Julia Kempe: Yeah. I agree with you that it's now the Wild West. There are security breaches and leaks, and the agents going off and giving your credit card number to everybody who's asking. It's now happening all the time. You saw the latest leak from Anthropic. That was the most shocking to me, when they couldn't prevent their own security breach that leaked the harness that makes Claude Code from Claude. That is shocking. If Anthropic can't prevent it, it means the agents have kind of grown capacities that we currently don't control. But I think, you know, historical lessons show that yes, there's the Wild West, and eventually we get the security. I mean, there will always be security leaks, but they won't be so obvious, right? It is a cat-and-mouse game that we now need to play. You know, now we give it a different way of doing this, and a different protocol, and sandboxing. I mean, everything is now sandboxed. I don't let it touch my email, because whenever I use it on my own computer, I really sandbox it, right? I just say, you can't have it here, those are your boundaries. And I think that will be put in place more and more. It's just that people were so carried away by Claude and what it can do that I think they forgot for a while that you need to put the right boundaries in place. I think we're just in the very early stages of these super powerful agents. And I think this will get solved, to some extent. At least the obvious ones, right? Those will go away.
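One small piece of the boundary-setting described above can be sketched as a path check that an agent harness might run before touching a file. This is a hypothetical illustration: the function name `is_inside_sandbox` and the example directory are assumptions, and a check like this is only one layer, not a substitute for OS-level isolation such as containers or restricted user accounts.

```python
from pathlib import Path


def is_inside_sandbox(requested, sandbox_root):
    """Return True only if `requested` resolves to a path under `sandbox_root`.

    Resolving first collapses `..` segments and symlinks, so a request
    such as `workspace/../mail` cannot slip past the boundary.
    """
    root = Path(sandbox_root).resolve()
    # Joining with an absolute `requested` path replaces `root`,
    # so absolute paths outside the sandbox are rejected below.
    target = (root / requested).resolve()
    return target == root or root in target.parents


if __name__ == "__main__":
    root = "/tmp/agent_workspace"  # hypothetical sandbox directory
    for request in ["notes/todo.txt", "../mail/inbox", "/etc/passwd"]:
        verdict = "allow" if is_inside_sandbox(request, root) else "deny"
        print(f"{request}: {verdict}")
```

The design choice is to deny by default: anything that does not provably resolve inside the allowed directory is refused.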
Ravid Shwartz-Ziv: And what do you think about regulation? Do you think we need regulation in the future?
Julia Kempe: Regulation by whom? I mean, like, governments regulating what? So it's a good question. I do believe regulation is a good thing, to some extent. I mean, yes, regulation.
Ravid Shwartz-Ziv: You tell me, as a European.
Julia Kempe: I do, and that leads us back to what we want to prevent in this AI. Allen, as you mentioned, the, what did you call them? The "I hate AI" people. Of course, we will need regulation. It has to be meaningful and implementable, and it needs to not stop progress. So these are so many demands. It's not possible, right? All three of these together are probably not possible. We'll have to have trade-offs. Yes, we need to regulate. No question, right? Otherwise, as I mentioned... you know, data privacy is important, and we need to make... but I mean, I have to say, as a European, with these regulations, people are not super surprised, right? I mean, these questions we've already had with the internet before, and now it's just taken to a different level, and I think regulation is absolutely needed for, you know, the protection of human rights and human privacy.
Allen Roush: Well, OK, so the importance of regulation is certainly something that I think all three of us might agree on. But it seems that there's a strong argument for keeping it deregulated, to have maximum innovation now, so as to keep forces that you might be even more fearful of, for example authoritarian China, from developing it faster and more efficiently. And with those dynamics, I feel that we're back in a kind of bad situation again. So.
Julia Kempe: Fair enough. I mean, again, we have this also with the internet and with data protection and so on. It depends, again, on what you want to regulate. I mean, there's this whole debate: should we open source our models? Right. And then the argument goes, we shouldn't, because some, you know, whatever, some country or whoever will take it and then not have regulations and do something bad with it. Right. This is how the argument usually goes. On the other hand, open sourcing also allows all the other players to make the models better, safer, do the right research. I think that eternal debate will stay, and we'll always strike some sort of compromise between the two, I think. I guess I'm not saying anything new; it's an old question, I'm saying. It's not a new question. And we will make these compromises as they come along, as models become more powerful.
Ravid Shwartz-Ziv: So we're almost out of time, but I have two more questions. One is, for people who want to start their own company, a startup, what do you think they should do? What topics? And the second one is, for PhD students who are starting their PhD now, what do you think they should work on?
Julia Kempe: Yeah, this is, mean, you can even ask what we should work on, right? Because certain questions we worked on now are either solved or in the hands of people who scale much more, right? And so on and so forth. So let me start with the students, because this is a question I really think about because, you know, we all hire students in academia and in the US. This is a five year endeavor. I mean, these students stay with you for five years. Now, in the past, when I hired students, I could kind of tell them how their PhD would look like. I would say, look, the first year, first you take courses, then you start being you know co-authoring papers but there is probably it's my idea and then you go and do the the ideation comes from somewhere else and then you execute and then you do this and eventually you will have your ideas and and you know and then you you go your own route you do whatever that's why vision etc I think this is now, actually, I'm being very frank with the PhD students and I say frankly, you know, in these five years, let's interpolate backwards from the last five years. cannot promise you that I know how your PhD will look like, but I do know that you, even your first and second year will not look like they used to look like for, you know, my previous generation of PhDs because in all honesty, â Claude is doing all that stuff. The pictures, the little experiments and running the ablation of the model, know, this checking the hyperparameter search. It's a real problem. because â there's a bit of a decision making on my end. Do I give a task that I know Claude can do to a student just like sports, so to speak? â Right. So that's just the mechanics of it for a student, how to learn. It's difficult and it's very hard to predict. But so then let's ask ourselves what the research we should do. Right. And I think in academia really let's not solve the problems that the big companies anyway thought. Right. 
I mean, let's not think of yet another loss function for reinforcement learning unless it is connected to things we care about, as I said, data efficiency, or some new crazy architecture that you cannot test in that scaling race because it's too far from the optimization manifold to work immediately, so companies won't have time to pursue it. So we need to do the out-of-the-box research, I mean the research that is not a direct continuation of what's done now. I find robotics and world modeling very exciting; I'm spending some time learning about it and thinking about it. And I think it's still in a very messy state. This whole idea of, again, I'm coming back to your information bottleneck, the whole idea of representing the world in a much smaller space, a latent space, like our brain does, right? It goes through some sort of bottleneck. This question is not well solved yet, and this is why our AI is so large, so we should be thinking about this more. And by the way, Allen, you mentioned the social sciences at some point, and I didn't come back to this. But you, Ravid, asked me what reasoning is, right? And honestly, I don't know the answer. As a mathematician in the past, and as a computer scientist, I actually didn't have to think about it. And I think now we're facing a reality where these questions are becoming interesting again: cognition. What is this? What is a machine? What is a human? And the social sciences, I think we are back to the Renaissance, right? I think they should also have their Renaissance for that reason, because we are now facing these questions that are also uncomfortable. I mean, people keep asking me what this reasoning thing is, and I have to start thinking about what reasoning really is and how evolution arrived at it. What are the philosophical implications? All that stuff is back also. So it's not only the Renaissance; the social sciences and humanities are back too, right?
So coming back to the students: they need to learn something broad and do the things that industry cannot do. Be very disciplined, because it's very easy to get into those questions of, did they do the reinforcement loop right? Let's just add another, let's just reweight the loss a little bit better and see what it does. I'm not sure this is really impactful. I have done it myself, so I'm not sure how impactful this research is if you are an academic and you're doing a PhD. You must do the stuff
Julia Kempe: we cannot do, and think, and, you know, talk to your social science buddies and your humanities buddies. And also the brain, right? We have a solution there, as I said: learn about the brain, think cognitive science. So being very broad as a student in the beginning will pay off, right? Take more classes, take some neuroscience classes, go to biology. All this I would tell my students, and I will tell this generation of my students. So these are the topics, and then make the connections, bring them together, and use Claude as much as you can to have the figures done and all that stuff that we don't need a human to drive anymore. Startups, what are the topics again? I think robotics, if you have some good ideas. I think security, as I said, it's big, I mean, it has to come. It's not there. All these security breaches that we see, there need to be more solutions. What else?
Ravid Shwartz-Ziv: What do you think about finance?
Julia Kempe: That's an interesting question. So I worked in finance for a while. I don't talk much to finance institutions. I know they are very conservative in general, so probably slow in adoption. And there's a reason for it, because there isn't a margin of error. When you make a mistake, it's extremely costly. It costs you your life. The average half-life of a hedge fund, I heard, is 18 months. And that's not because they do a bad job in daily trading, but because they forget the tail event or something, right? I mean, these kinds of extreme events are often gotten wrong. So finance is conservative for that reason. Finance has a lot of data, right? And the data has patterns. Of course algorithmic trading has been there, you know, going back to the late seventies and eighties. It must profit, I'm sure. I don't know, again, what they're doing, but surely machine learning should help find these patterns. That's what it's really good for. Then we can think further, because what is finance? Why is there profit in algorithmic trading, right? Because there are market inefficiencies, obviously, right? Because if there weren't, there's always the, see, I forgot what the law is, the law of markets, or, I forgot, yeah, you know what I mean, the law of markets being calibrated, and you shouldn't be making a profit in principle because the market already anticipates its moves. But you do make a profit. I mean, I worked in a finance company that made a ton of profit, which means markets aren't equilibrated, but they should be, because why should a finance company make a lot of profit? It's at the expense of somebody, right? Obviously, like the pension funds or somebody. So if we can contribute to making the markets actually balanced by having machine
Ravid Shwartz-Ziv: Yeah.
Julia Kempe: learning, spotting the patterns, immediately balancing them out, it will become less profitable, but probably better for humanity. So let me turn it into a positive story. Yes, so in that sense, finance should use as much AI as possible to make the markets finally balanced, cease to exist in some ways or cease to have so many profits, but make it better for everybody else who's participating in the market.
Ravid Shwartz-Ziv: Probably it will not be like that, right? Probably there will be some companies that will earn a ton of money, and all the others...
Julia Kempe: I don't know. I mean, but you know, there are serious market inefficiencies, and if AI helps get rid of some of them fast and makes the market more efficient, that's a good thing.
Allen Roush: I'll point out that this is also the argument for prediction markets, right, which is that they add insider information to the economy. But now we just have everybody betting on everything with Polymarket, right, and Kalshi. What's the other one? Yeah.
Ravid Shwartz-Ziv: But the problem there is that you bet on whether someone will die, right? Or crazy things like that. It's immoral to do it, even besides the inside information.
Julia Kempe: Yeah, I feel it.
Ravid Shwartz-Ziv: Okay, I think we're out of time. Do you have anything else that you want to add, to promote? Anything?
Julia Kempe: That's a good question. I actually don't have much of a personal agenda here. I mean, again, an appeal. I think what is close to my heart is, again, maybe two things. I mentioned them already. One was the message to the mathematicians and, generally, to people who are somewhat still in denial: really use the tools. Just have a look once, because if you don't, somebody else will and you'll be worse off. You have to use the tools, have a look. And then my second message, again, is that as much as they can be doing evil on one hand, they can do good on the other hand. And I think these grassroots initiatives, I have these friends who always did hacking for good and all these things. Now there should be many more grassroots AI-for-good initiatives, and they will be able to do things, right? Good things. You know, there is so much to counter all the things we don't like about AI. There is so much in human nature that can be leveraged with AI to do good, so to speak. These are the two points. Other than that, I wanted to thank you guys. You're doing an amazing job. I think the podcast generally is a wonderful medium, and you guys are spreading the word of various people, which is fantastic. I love your title. I think it's very much to the point, as I said several times. So thank you very much.
Ravid Shwartz-Ziv: Thank you, thank you so much for coming, it was really great. Always great to talk with you.
Julia Kempe: Magic.
Allen Roush: It's a pleasure to meet you, Julia.
Julia Kempe: Likewise. Thank you.
Ravid Shwartz-Ziv: And thank you to the audience for listening. See you next time.