Sam Eisenstat's independent AI alignment research program, focused on mathematical foundations of agency, logical uncertainty, concept formation (condensation theory), and causal modeling at different levels of abstraction.
People
Updated 05/18/26
Lead Researcher
Funding Details
Updated 05/18/26
- Annual Budget: -
- Current Runway: -
- Funding Goal: -
- Funding Raised to Date: $736,000
Org Details
Updated 05/18/26
The Eisenstat Research Program is the independent AI alignment research effort of Sam Eisenstat, a mathematician who has been a research fellow at the Machine Intelligence Research Institute (MIRI) since April 2017. The program focuses on foundational mathematical questions about reasoning, agency, and concept formation that are relevant to building safe AI systems.
Sam Eisenstat studied pure mathematics at the University of Waterloo, where he conducted research in mathematical logic. During his undergraduate years, he received an honourable mention in the William Lowell Putnam Mathematical Competition (ranked 40th out of 4,113 contestants in 2013). Before joining MIRI, he worked at Google on the automatic construction of deep learning models.
At MIRI, Eisenstat has made several notable contributions to agent foundations research. His untrollable prior demonstrated that there is a Bayesian solution to one of the basic problems that motivated the development of non-Bayesian logical uncertainty tools (culminating in logical induction). His logical inductor tiling result solved a version of the tiling problem for logically uncertain agents, and was described as perhaps MIRI's most significant 2018 public result in the agent foundations category. He also co-developed a formulation of Bayesian logical induction with Tsvi Benson-Tilsen.
More recently, Eisenstat has developed condensation theory, a mathematical framework for understanding how agents create concepts to understand their world. Unlike standard compression, condensation is concerned not with total code-length but with how easy it is to use an encoding to answer questions about the data, which forces the encoding to be organized in a conceptually meaningful way. He proved a theorem showing that different approximately-efficient condensations will posit approximately-isomorphic random variables, meaning that two agents who condense data well will arrive at similar conceptual frameworks. This work was presented at the MAISU conference and published on the AI Alignment Forum.
Eisenstat also co-authored a paper on factored space models with Scott Garrabrant and others, proposing an alternative to traditional causal graphs that can represent both probabilistic and deterministic relationships at all levels of abstraction. This paper was submitted to arXiv in December 2024. Additionally, he collaborated with Tsvi Benson-Tilsen and Nisan Stiennon on cooperative oracles, a refinement of reflective oracles.
The program is fiscally sponsored through both MIRI and Ashgro, Inc. (a 501(c)(3) nonprofit that provides fiscal sponsorship to AI safety projects). It has received funding from the Survival and Flourishing Fund, first as "Eisenstat Research Directions" in the SFF-2024 round and then as "Eisenstat Research Program" in the SFF-2025 round.
Theory of Change
Updated 05/18/26
By developing rigorous mathematical foundations for understanding reasoning, agency, and concept formation, Eisenstat's work aims to provide the theoretical tools needed to build AI systems that are safe and aligned with human values. His condensation theory addresses how agents form and share concepts, which is relevant to ensuring AI systems develop understandable and compatible conceptual frameworks. His work on logical uncertainty and decision theory contributes to solving fundamental problems in how AI agents should reason and make decisions under uncertainty. The factored space models research addresses how to reason about causality across different levels of abstraction, which is important for understanding and controlling the behavior of complex AI systems.
Grants Received
Updated 05/18/26
Projects
No linked projects
Updated 05/18/26
Discussion
Key risk: Eisenstat's MIRI-style foundational agenda (condensation theory, cooperative oracles, factored spaces) may not translate into practical leverage on current frontier ML systems, so uptake and counterfactual impact could be low, especially given the throughput limits of a one-person program.
Case for funding: Sam Eisenstat has a rare track record of producing novel and rigorous agent-foundations results (untrollable prior, logical inductor tiling). He is now pushing condensation theory and factored space models toward principled accounts of concept formation and multi-level causality, a high-leverage theoretical route to interpretable, alignable agents that few researchers are capable of advancing.