The Alignment Research Center (ARC) is a nonprofit research organization founded in 2021 by Paul Christiano, dedicated to aligning future machine learning systems with human interests. ARC pursues theoretical research on producing formal mechanistic explanations of neural network behavior, combining ideas from mechanistic interpretability and formal verification. Their work centers on intent alignment: ensuring ML systems are genuinely helpful and honest rather than merely appearing so. Key research areas include heuristic arguments, Eliciting Latent Knowledge (ELK), mechanistic anomaly detection, and low probability estimation. ARC previously housed ARC Evals, which spun out as the independent nonprofit METR in late 2023.
Funding Details
- Annual Budget: $9,050,000
- Monthly Burn Rate: $754,167
- Current Runway: -
- Funding Goal: -
- Funding Raised to Date: -
- Fiscal Sponsor: -
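The listed monthly burn rate follows directly from the annual budget. A minimal sketch (my own arithmetic check, not from the source) confirming the two figures are consistent:

```python
# Verify that the listed Monthly Burn Rate is the Annual Budget
# spread evenly over 12 months.
annual_budget = 9_050_000          # Annual Budget from the table above
monthly_burn = round(annual_budget / 12)

print(monthly_burn)  # → 754167, matching the listed $754,167
```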
Theory of Change
ARC believes that as ML systems become more capable, current alignment approaches may fail to scale, potentially producing systems that pursue goals misaligned with human interests. Their theory of change centers on developing rigorous theoretical foundations for alignment before superintelligent systems arrive. By developing heuristic arguments, formal mechanistic explanations of neural network behavior that combine ideas from mechanistic interpretability and formal verification, ARC aims to enable reliable detection of when AI systems might behave in dangerous or deceptive ways. This theoretical groundwork is intended to inform practical alignment techniques that AI labs building frontier models can apply, ensuring that powerful AI systems are genuinely helpful and honest rather than merely appearing aligned.
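To give a concrete flavor of the anomaly-detection idea, here is a toy sketch of my own (not ARC's actual method, which is formal and based on heuristic arguments): fit a simple statistical model of a network's internal activations on trusted inputs, then flag inputs whose activations fall far outside that model.

```python
import numpy as np

# Illustrative activation-based anomaly detection: model a hidden layer's
# activations on trusted inputs as a Gaussian, then flag inputs whose
# Mahalanobis distance from that distribution exceeds a calibrated threshold.
# All data here is synthetic; "trusted" stands in for activations collected
# on inputs where the model's behavior is known to be benign.

rng = np.random.default_rng(0)

# Hypothetical trusted activations: 500 samples of a 16-dim hidden layer.
trusted = rng.normal(0.0, 1.0, size=(500, 16))

mean = trusted.mean(axis=0)
cov = np.cov(trusted, rowvar=False) + 1e-6 * np.eye(16)  # regularize
cov_inv = np.linalg.inv(cov)

def mahalanobis(x):
    """Distance of activation vector x from the trusted distribution."""
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))

# Calibrate a threshold as the 99th percentile of trusted-data distances.
dists = np.array([mahalanobis(x) for x in trusted])
threshold = np.quantile(dists, 0.99)

def is_anomalous(x):
    return mahalanobis(x) > threshold

# An input drawn from a very different regime should be flagged.
shifted_input = rng.normal(5.0, 1.0, size=16)
print(is_anomalous(shifted_input))
```

The design choice here (a Gaussian fit plus a distance threshold) is the simplest possible stand-in for the kind of "explained vs. unexplained behavior" distinction ARC studies; the research question is how to make such detection principled for real networks.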
Grants Received
- from Survival and Flourishing Fund
- from Open Philanthropy
- from Long-Term Future Fund
- from Survival and Flourishing Fund
- from Open Philanthropy
Projects
No linked projects.
People
No linked people.
Details
- Last Updated
- Apr 2, 2026, 10:00 PM UTC
- Created
- Mar 18, 2026, 11:18 PM UTC
