May 7, 2025

Academia to Agentic AI

Ion Stoica of Databricks with Charles Packer & Sarah Wooders of Letta

I recently hosted a conversation with Ion Stoica, Executive Chairman and co-founder of Anyscale and Databricks, alongside Charles Packer and Sarah Wooders, co-founders of Letta. Ion also leads Berkeley's Sky Computing Lab, where Charles and Sarah were PhD students and researchers. At Berkeley, they created MemGPT, an open-source project that enables AI agents to have memory and state. In addition to being a three-time founder, Ion has been a creator of or key contributor to many impactful open-source projects, including Ray, Apache Spark, and Chatbot Arena.

This conversation explores the transition from academia to entrepreneurship, the value of open-source in building developer communities, and how Letta is revolutionizing AI agent infrastructure. The founders discuss the future of AI agents, scaling compute beyond inference time, and their vision for more reliable AI applications that can be trusted in production environments.

Transcript

This transcript was edited for clarity.

Astasia: Welcome, and thank you all for doing this. It's a pleasure to have you. You all came from academia and moved into entrepreneurship. What were some of the advantages of coming from academia when you started your journey?

Charles: Startups and academia are actually pretty similar in many ways. In both cases, it's very unstructured. You have to pave your own path, think very deeply about very hard problems, and no one's really going to tell you the right answer. That sort of environment, where you have a lot of time to think and a lot of time to work on a very unstructured problem, is in many ways pretty similar to working on a startup idea.

Astasia: That makes a lot of sense. Maybe Sarah, what were some of the biggest challenges coming from academia that you worked through?

Sarah: I think the nature of the work is really different. In grad school, you spend a lot of time on research and reading papers, but once you have a startup, you of course spend a lot more time on operational things like recruiting and marketing. So that was a really big shift in the type of work that we're doing day to day.

Astasia: Ion, I can imagine you have thoughts; you've had many companies come out of your labs.

Ion: I think there are similarities and differences. As a faculty member, you still need to recruit and raise money, and lead a lab, projects, and things like that.

So that's quite similar. I think the difference is pretty clear: academia is the ultimate incubation environment. There is a lot of freedom in what you're going to do, what you want to work on, and the scope of the projects you take on. It's very diverse.

When you start a company, you will, by necessity, be a lot more focused. Yes, at the beginning, companies are somewhat unstructured and you need to do more operational work, but as a company grows, it becomes more and more structured. The nice thing about a company is that you can put a lot of resources behind your idea and your vision, not only for the four to six years a PhD takes, but for five to ten years or more.

Astasia: Do you have any advice for other academics or students who want to make the transition to becoming a founder?

Charles: For us, the transition was relatively straightforward because we had been working on this open-source project. As you know, Berkeley, especially the Sky Computing Lab and RISELab, is pretty famous for its open-source AI projects.

That was something that accompanied the research paper we wrote at Berkeley. There were a lot of people using the research product, which was a departure from the pure research itself. So starting the company around the research made a lot of sense.

It was also pretty clear to us in the PhD program, while doing the research, that state and memory in particular were really the limitations of LLM-driven AI. And those are things that need to be solved both fundamentally at a research level and also from a product perspective.

I don't think developers want to use LLMs. They want to use agents. And agents are much more than just the LLM.

Astasia: I remember when I was first getting to know you that you were so involved in the community, getting their feedback and thoughts on the project, how it could be improved, and what the direction could be.

You really went in with that builder, entrepreneur mindset: if you open source something, you get feedback from users early in the journey. Do either of you have advice for other researchers or academics who are thinking through possibly becoming a founder? Words of wisdom for making that choice?

Sarah: Well, I think like Ion said, being a grad student is an opportunity to explore a lot of ideas and work with a lot of different people. It's a really good environment to find co-founders, people who you're going to work with. And it's also a really good opportunity to think about what the really important problems are to work on.

Because one thing that Ion emphasized a lot during my PhD was that it's very important to choose the right problem to work on. And that applies both for research and startups.

Ion: So why would you do a company as a researcher from academia? Because you are very passionate about what you're doing, about the problem you're solving, and you want to take it to the next level, right?

That's what it is. You believe that what you are doing is going to have an impact, and a company is the way to take what you are doing and push for it. So, if you really believe in what you are doing, and you want to take it all the way to maximize the impact, that's a very good reason to do a startup.

Astasia: Charles alluded to this: your work at the Sky Computing Lab has generated many very impressive open-source projects. What makes your lab stand out so much? What do you think sets your lab's open-source work apart from others?

Ion: So there are a few things; many things have to come together. One thing is to have great students. Everything starts with that, right? The other thing is that, obviously, I am a big believer in being pretty careful about the problem you will work on, right?

We have limited time and limited resources, so the problem you're going to work on is going to make a big difference in the outcome. One thing you look for, in general, is trends: at the application level, at the infrastructure level, all the way down to the hardware.

These are trends that you have limited ability to impact, right? So you'd better understand them and see how whatever you are doing fits into these trends. For instance, the need for compute for language models, or deep learning in general, goes all the way back to 2012, and it has been growing much more quickly than the capabilities of a single machine. Things like that. Or, you look at what Letta is doing; it's a fundamental thing. Like in many control systems, memory is an integral part of how we accumulate information and how we use it to make future decisions.

The third point: I love open source because I think it's one way to maximize your impact. You have a chance as a small group of researchers in academia to have an impact, because open source is not only about writing the code; it's about building a community.

You have a chance to build a thriving community working on the same project, with the same goals and the same vision. So that's one way to amplify your impact. The last thing I would say is that if you have a successful open-source project and you want to start a company, you do have an advantage, because when we're talking about product-market fit, half of the problem is solved if the open source is successful. It means that there are users, and if there are users, they're using it because it solves some of their problems. And the second half is that they can contribute to the effort.

Astasia: That's also a great segue to thinking about open source as an academic researcher. Do you have any advice for academics who want to open source their work but may intend to commercialize it later? Any guidance on that process?

Charles: One very common piece of advice, which I think is pretty accurate: you shouldn't open-source stuff that you expect to potentially take away later. Once you make something free and open source, you should expect that people will want it to be free and open source indefinitely.

If you have certain aspirations about monetizing your product or your code, you should think ahead of time about what you want to build in the open source, and whether you can structure it in a meaningful way where, for example, you have an open core and then build services around it that you can monetize.

Of course, the great thing about open source is that a lot of people also just do it for the love of open source, building with a community of other developers. And in that case, I think it's great to just open-source everything. It depends on what you want to build.

Astasia: Ion, what techniques or strategies did you find particularly useful from your time at Databricks and Anyscale when thinking about open source?

Ion: One particular insight I had quite recently, and I am ashamed I didn't have it earlier on. If you look at a lot of open source companies, some of them are successful and some of them are not, or they're not as successful as people hoped.

When you do open source, you have users and you solve a problem. Now you need to generate revenue, right? You need to get people to pay for some value. A lot of that comes down to how deep your stack is.

The deeper the stack, the more capability you have to innovate within the stack. I'll give you an example: Docker. Docker is very thin. It's very challenging to innovate in that area. Something like Spark is much deeper; it goes almost all the way from writing your queries in SQL down to the hardware.

So there are many more things you can optimize, so you can provide better performance, better scalability, and lower cost, right? The second thing: if your open-source project is more like a framework, a distributed system, or something like that, then one reasonably easy way to start is by offering a managed service.

Because it's hard to manage that kind of framework or system yourself, if you provide people a managed service, that's a good first step, right? These kinds of distributed systems allow you to do that. But eventually, you have to innovate in the stack, and the deeper your stack, the more options you have.

Astasia: I really liked your example of the managed service: moving from a single-user, single-node environment to a distributed system at scale for production. Having things in the open source helps with adoption for the individual, and then a commercial offering empowers them to use it across the entire company. Sarah, I'm curious to learn more: how does Letta fit into the AI agent stack, and what is your vision for how Letta will have an impact on AI applications today and in the future?

Sarah: When we started the company, there was actually a lot of work that had been done in our lab around LLM serving; vLLM is also out of the same lab. And we were thinking: why are developers interfacing with LLMs? It makes a lot more sense if developers are instead interfacing with agents as services, as opposed to just LLMs or models as services.

Over time, there's going to be a lot more complexity that develops at this layer of memory and state management: basically, optimizing context windows to get the best-performing agents possible. So that was the original vision we had when we started the company, and it's what we've continued to work on.

I think the way Letta is designed is a bit different from other frameworks: Letta is deployed as a service, not just an agent library. So, as opposed to just importing Letta, you actually need to run a Letta server or connect to Letta Cloud.

And in the future, we envision that instead of interfacing with LLM APIs and managing their own state and memory, developers will be interfacing with these agent APIs and agents as services.
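To make that contrast concrete, here is a minimal sketch of the agents-as-a-service pattern Sarah describes, where state lives on a server rather than inside your application. The endpoints and payloads below are illustrative assumptions for this sketch, not Letta's documented API.

```python
import requests

# Hypothetical agent-service endpoints, for illustration only.
BASE_URL = "http://localhost:8283"  # a locally running agent server

# Create a stateful agent once; the server persists its memory and state.
agent = requests.post(f"{BASE_URL}/v1/agents", json={
    "name": "support-agent",
    "model": "gpt-4o-mini",  # the agent wraps whichever LLM you configure
    "memory": {"persona": "helpful support rep", "human": "new user"},
}).json()

# Later, possibly from a different process, message the same agent by ID.
# The server, not your application code, manages context, memory, and state.
reply = requests.post(
    f"{BASE_URL}/v1/agents/{agent['id']}/messages",
    json={"role": "user", "content": "What did we decide last time?"},
).json()
print(reply)
```

The design point is that the second call can come hours later, from a different machine, and the agent still has its memory; with a raw LLM API, your application would have to reconstruct all of that context itself.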

Astasia: With that in mind, and with your vision for people building AI agents, could you talk through some of the work the team has been doing here?

You were giving me a preview before we started rolling. Maybe either of you can talk briefly about what's coming next.

Charles: One of the amazing things about working in AI, both in research and in production as a startup, is that the abstractions aren't settled yet. I think we all have a very strong intuition that the LLM is far from a complete agentic system, and there are many more pieces we need to build around the LLM. The LLM is like the CPU or the chip inside a computer. So we have all these companies making these exceptional chips; it's kind of like the microprocessors, right?

But no one has made the computer. And the computer in this case is really this platform, this infrastructure layer that Sarah's talking about, that is LLM-agnostic. It keeps state, it keeps memories, and it lets you build agents that have state and run them as services.

With that in mind, we have many more ideas beyond MemGPT as to what the real building blocks should be when you're building these systems. I think MemGPT was very much a V1 in this direction of an operating system for LLMs. But there are many more ideas, in particular around scaling compute.

I think scaling compute at inference time has become an extremely popular idea, for very good reasons. It's another way to push the limits of intelligence. But I think there are also very interesting ideas around scaling compute not at inference time, but in the background, similar to how a brain is always on, even when a human is not actually conversing with anyone.

Similarly, if you have these LLM agents and they're running as services, they're always on in a data center. Why not just have the agents running all the time, taking advantage of the compute they have? That's one idea we've been working on.
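As a rough sketch of that always-on idea: an agent server could spend idle cycles reorganizing the agent's memory instead of waiting for the next request. Everything here (the agent object and its memory methods) is hypothetical, just to show the shape of it.

```python
import time

def sleep_time_loop(agent, is_idle, budget_per_cycle=5):
    """Hypothetical background loop: while no user is talking to the agent,
    spend spare compute distilling raw memories into higher-level notes."""
    while True:
        if is_idle():  # no live conversation in progress
            for item in agent.memory.sample_unprocessed(k=budget_per_cycle):
                # Ask the LLM to extract durable facts from a raw memory
                # and store them where future turns can use them cheaply.
                note = agent.llm(f"Summarize the durable facts in: {item}")
                agent.memory.write_insight(note)
        time.sleep(30)  # wake up periodically and check again
```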

We've also been working on some other very exciting directions around voice. In voice settings, context management matters because you want very low latency. You want these agents to be able to reply very quickly, which means you can't actually put that much information into the context window. And I think that's actually the perfect use case for an LLM OS, because you can use the OS to pack the perfect amount of tokens into the context window so that the agent driving the voice is extremely fast.

It has all the information it needs, and it has a very tight context window. But you have this larger agent, this memory manager, that's making sure that all the information the voice agent needs is always in the context window. And that's another twist on the LLM OS idea that we introduced with MemGPT.
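Here is a minimal sketch of that kind of token-budgeted packing, assuming a hypothetical relevance scorer and token counter. The memory manager would run something like this before each turn so the voice agent's prompt stays small and its replies stay fast.

```python
def pack_context(candidates, relevance, count_tokens, budget=2000):
    """Greedy, budget-bounded packing: keep only the most relevant
    snippets that fit within a small token budget."""
    packed, used = [], 0
    for snippet in sorted(candidates, key=relevance, reverse=True):
        cost = count_tokens(snippet)
        if used + cost <= budget:
            packed.append(snippet)
            used += cost
    return "\n".join(packed)
```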

Astasia: What I love about this discussion is how clearly your time at UC Berkeley as PhD students inspires what you're building at Letta: doing applied research to really push the bounds of what is possible with AI today, and thinking critically about the next steps.

It's very inspiring. You have direct experience in academia working on the bleeding edge of what's coming, as well as with real-world customers. If we look ahead two years from now, is there a critical area of AI that will be very important, that people aren't talking a lot about today, but that is on your radar?

Charles: I think this idea of scaling compute, but not at inference time, is going to be huge. I think this will be the new axis, the new scaling law, that everyone tries to squeeze the most juice out of. All these ideas around scaling at test time or inference time are happening, or becoming popular, because we are saturating the performance of the base models.

And I think all these techniques, even RAG, all these techniques for squeezing the most juice or the most intelligence out of these LLMs, are unified by this concept of the LLM OS, or the context window manager.

At the end of the day, the LLM is just a machine: a stateless machine that takes in tokens and outputs tokens. We're basically trying to build very complex compound systems out of this machine. So much of the research and the excitement in AI progress will shift above the model over the next few years. And I think the real area of focus will be things like scaling compute, not at test time, but at sleep time.

And then also just codifying all the abstractions for the LLM OS: what are the real system-level abstractions? I think those will all become less alchemy and more engineering.

Astasia: That's cool. I like that. Less alchemy and more engineering. Ion, your lab continues to push the bounds of what's happening next. What areas do you think are critical that you all are working on?

Ion: So, following up on this, I think the biggest challenge in AI is going to be to continue building reliable AI applications, period. You need an AI application you can trust, right? It doesn't matter how sophisticated the things you can do are if you are not reliable, right?

And the more valuable a task is, the more reliable it has to be. There has to be a way to verify, and therefore to specify, these tasks. You need to specify a task in order to verify it.

Verification means that the task has been done according to some specification. This is true for humans as well, right? You tell an engineer, do X, here is a PRD, and then you are going to get a program which hopefully satisfies the requirements in the PRD, right?

But there is another thing: once you have the ability to verify, it makes test-time compute much more powerful, right? If you have a precise goal, and this is true for humans as well, you can make much more rapid progress. That's why the most successful results in test-time compute are in a few domains where you do have pretty precise goals: solving math problems, or coding, where you have unit tests and things like that. But overall, expanding the kinds of domains where you can build a reliable application you can trust is going to be the key.

Right now, successful applications have a human in the loop: copilots, customer support, all of these things. The human in the loop basically acts as the verifier and makes the ultimate decision.
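To make the verification point concrete, here is a minimal sketch of a generate-and-verify loop for a domain with a precise goal, using unit tests as the verifier. The generate and run_tests functions are hypothetical stand-ins for a code-writing model and a test harness.

```python
def solve_with_verifier(task, generate, run_tests, max_attempts=8):
    """Test-time compute with a verifier: keep sampling candidate programs
    until one passes the task's unit tests (the precise, checkable goal)."""
    feedback = ""
    for _ in range(max_attempts):
        candidate = generate(task, feedback)   # e.g., an LLM writing code
        passed, report = run_tests(candidate)  # the verifier: unit tests
        if passed:
            return candidate
        feedback = report  # feed failures back so the next attempt improves
    return None  # no verified solution within the compute budget
```

Domains without such a checkable goal give the loop nothing to iterate against, which is Ion's point about why progress concentrates where verification is cheap; a human in the loop plays the run_tests role in today's copilot-style applications.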

Astasia: So it sounds like you're excited about scoring systems and rewards. Maybe shifting gears a little bit to a lightning round to wrap things up. To kick things off, the first question I have is: what's one must-read paper right now?

Sarah: I think the sleep-time compute paper.

Astasia: So that's coming soon. Any other suggestions?

Ion: I would say the DeepSeek-R1 paper.

Astasia: Great suggestion; that's on theme with what we're talking about. Next question: what's your most controversial opinion about AI?

Charles: Ion?

(Laughter)

Ion: You are the dispatcher of the questions. This is not necessarily controversial, but I think you are going to see fast progress in the domains I mentioned earlier: domains where you can have precise goals and can verify the outcomes. You are going to see disproportionate progress in these kinds of domains versus the others.

Astasia: It makes a ton of sense. We're seeing a lot of that with AI for software engineering practices.

Ion: It's the most popular application today, and the fastest growing compared with other AI applications.

Astasia: A hundred percent. And last question, maybe this will lead to more controversy. Do we have AGI in two years?

Charles: I think by most definitions of AGI, definitely not. But as Ion said, in narrow domains, what you can do with techniques like those shown in R1 is definitely going to lead to some pretty intense progress in some verticals.

The generally intelligent assistant straight out of a sci-fi movie, I think we're a little bit farther away from that. Hopefully, at Letta, we're going to make some serious progress toward it. But I think two years is cutting it a little close.

Astasia: Sarah, do you agree?

Sarah: Everyone has a very different definition of AGI, so it's a little hard to say. I do think the capabilities will get a lot better in the next two years. Even if improvements in the models stall, I think we will just get better and better at using the models. That's a big thing we work on at Letta: optimizing the context windows to squeeze out the most juice that we can. So I think we will see a lot of improvements. We're just starting to see the beginning of agent capabilities.

I think, especially as the models get cheaper, there'll be a lot more scaling. We'll be able to do a lot more interesting things than we can do now. So I think there'll be a lot of progress, but I don't know if you would call it AGI.

Ion: So I think the nice thing about not having a precise definition is that everyone is going to be right about this question in two years. There'll be people who are going to claim we have AGI, and there are people who are going to say, this is not AGI for sure.

Astasia: Well, Ion, Sarah, Charles, thank you so much for sitting down with me today and talking through the path for academics to become entrepreneurs, the role of open source, and what you're working on with Letta for providing state and context to AI agents.