Technical Alignment Impossibility Proofs
An independent research project focused on proving formal impossibility results in AI alignment using theoretical computer science methods, led by Alexander Bistagne as a Ronin Institute Fellow.
Funding Details
Updated 03/19/26
- Annual Budget: -
- Current Runway: -
- Funding Goal: -
- Funding Raised to Date: $176,070
Org Details
Technical Alignment Impossibility Proofs is an independent research project led by Alexander Bistagne, a 2021 UCSC graduate and Fellow of the Ronin Institute for Independent Scholarship. The project received a $170,000 grant from the Survival and Flourishing Fund (SFF) in the 2022-H2 S-Process grant round, funded by Jaan Tallinn, with the Ronin Institute for Independent Scholarship Incorporated serving as the receiving charity and fiscal sponsor.

The project's central research direction, titled "Alignment is Hard," seeks to formalize the argument that AI alignment testing is computationally intractable. The core claim is that if we cannot prove a program will loop forever, we cannot prove an agent will care about us forever. More formally, under three conditions (the agent's environment can be modeled with discrete time, the agent's architecture is agentically-Turing complete, and the agent's code is immutable), testing the agent's alignment is CoRE-Hard, that is, at least as hard as the complement of the halting problem.

Bistagne posted his paper "Alignment is Hard: An Uncomputable Alignment Problem" on LessWrong in November 2023. The paper was submitted to the Alignment Forum and rejected. He also received a smaller Manifund grant of approximately $6,070 for the same line of research.

A second research direction, "Control By Committee," explores alignment of ensembles of agents through multi-agent structures, an idea Bistagne had been developing since 2018-2019.

The project's Manifund page was subsequently closed, with Bistagne stating he could not in good faith ask for more money without an environment conducive to AI safety research. As of late 2025, the project appears to be inactive. The original Ronin Institute for Independent Scholarship (NJ) dissolved in September 2024. A successor organization (RIIS 2.0), incorporated in Sacramento, California in April 2024, still lists Bistagne as a Fellow.
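The reduction behind the core claim in the organization details (an agent stays aligned forever only if some program loops forever) can be illustrated with a minimal Python sketch. This is a toy model, not the construction from Bistagne's paper: the generator-based `Program` type, the `"cooperate"`/`"defect"` actions, and every function name below are assumptions introduced for illustration only.

```python
from typing import Callable, Iterator, Optional

# Assumed toy model: a "program" is a generator function that yields once
# per computation step and returns when the program halts.
Program = Callable[[], Iterator[None]]


def loops_forever() -> Iterator[None]:
    """A program that never halts."""
    while True:
        yield


def halts_after(n: int) -> Program:
    """Build a program that halts after exactly n steps."""
    def program() -> Iterator[None]:
        for _ in range(n):
            yield
    return program


def make_agent(program: Program) -> Callable[[], Iterator[str]]:
    """Wrap a program as an agent in discrete time.

    The agent cooperates on every step while the program is still running
    and defects forever once the program halts. So the agent is aligned
    forever exactly when the program never halts.
    """
    def agent() -> Iterator[str]:
        for _ in program():
            yield "cooperate"
        while True:
            yield "defect"
    return agent


def find_defection(agent: Callable[[], Iterator[str]],
                   max_steps: int) -> Optional[int]:
    """Bounded test: return the first step at which the agent defects,
    or None if no defection is observed within max_steps steps."""
    for step, action in enumerate(agent()):
        if step >= max_steps:
            return None
        if action == "defect":
            return step
    return None


if __name__ == "__main__":
    print(find_defection(make_agent(halts_after(3)), max_steps=10))
    print(find_defection(make_agent(halts_after(3)), max_steps=3))
    print(find_defection(make_agent(loops_forever), max_steps=1000))
```

In this sketch, a finite observation can confirm misalignment (a defection is found), but a `None` result is inconclusive: the test cannot tell an agent whose program never halts from one whose program halts just after `max_steps`. A general procedure that certified "aligned forever" for such agents would also decide that arbitrary programs never halt, which is the intuition behind the hardness claim.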
Theory of Change
The project's theory of change is that establishing formal impossibility results in AI alignment can redirect the field's efforts more productively. By proving that certain approaches to alignment testing are computationally intractable or undecidable, the research aims to demonstrate that aligning black-box AI agents may be fundamentally impossible. Such results could convince researchers to focus on alternative approaches, such as designing AI architectures that are provably aligned by construction, rather than attempting post-hoc alignment verification.