TruthfulAI, led by Owain Evans, conducts research aimed at producing safe and aligned AI systems. The organization focuses on understanding how language models develop situational awareness, engage in deception, and exhibit out-of-context reasoning. Key research contributions include TruthfulQA (a benchmark measuring model truthfulness), work on emergent misalignment, and research on subliminal learning. TruthfulAI is a fiscally sponsored project of Rethink Priorities and also mentors researchers through programs such as the Astra Fellowship and MATS.
Funding Details
- Annual Budget: -
- Monthly Burn Rate: -
- Current Runway: -
- Funding Goal: -
- Funding Raised to Date: $1,171,120
- Fiscal Sponsor: Rethink Priorities
Theory of Change
TruthfulAI believes that making AI systems reliably truthful and transparent is a key lever for reducing catastrophic AI risk. By developing benchmarks and conducting empirical research into how models deceive, develop situational awareness, or exhibit subtle misalignment, the organization aims to give the field tools and findings that can inform safer training practices and evaluation standards. The causal chain runs from foundational research on model behavior, to better evaluation methods, to improved alignment techniques adopted by labs, and ultimately to AI systems that are less likely to deceive or act against human interests.
Grants Received
No grants recorded.
Projects
No linked projects.
People
No linked people.
Discussion
No comments yet.
Details
- Last Updated: Apr 2, 2026, 10:08 PM UTC
- Created: Mar 19, 2026, 10:30 PM UTC