TruthfulAI

Berkeley, California

early | 4 people | Founded 2023

TruthfulAI, led by Owain Evans, conducts research aimed at producing safe and aligned AI systems. The organization focuses on understanding how language models develop situational awareness, engage in deception, and exhibit out-of-context reasoning. Key research contributions include TruthfulQA (a benchmark measuring model truthfulness), work on emergent misalignment, and research on subliminal learning. TruthfulAI is a fiscally sponsored project of Rethink Priorities and also mentors researchers through programs such as the Astra Fellowship and MATS.

Funding Details

Annual Budget: -
Monthly Burn Rate: -
Current Runway: -
Funding Goal: -
Funding Raised to Date: $1,171,120
Fiscal Sponsor: Rethink Priorities

Theory of Change

TruthfulAI believes that making AI systems reliably truthful and transparent is a key lever for reducing catastrophic AI risk. By developing benchmarks and conducting empirical research into how models deceive, develop situational awareness, or behave in subtly misaligned ways, the organization aims to provide the field with tools and findings that can inform safer training practices and evaluation standards. The causal chain runs from foundational research on model behavior, to better evaluation methods, to improved alignment techniques adopted by labs, and ultimately to AI systems that are less likely to deceive or act against human interests.

Grants Received

No grants recorded.

Projects

No linked projects.

People

No linked people.

Details

Last Updated: Apr 2, 2026, 10:08 PM UTC
Created: Mar 19, 2026, 10:30 PM UTC