Paul Christiano's AI alignment blog, hosted at ai-alignment.com, is a Medium publication where he published technical writing on AI safety, alignment, and control from roughly 2015 through the early 2020s. Topics include iterated amplification, reinforcement learning from human feedback (RLHF), eliciting latent knowledge, scalable oversight, and the conceptual foundations of AI alignment. Christiano is one of the most influential figures in AI alignment research, having co-developed RLHF at OpenAI and founded the Alignment Research Center. The blog is no longer actively updated but remains an important archive of foundational alignment thinking.
Funding Details
- Annual Budget: -
- Monthly Burn Rate: -
- Current Runway: -
- Funding Goal: -
- Funding Raised to Date: -
- Fiscal Sponsor: -
Theory of Change
Christiano's theory of change centers on developing technical solutions to AI alignment before transformative AI systems are deployed. His approach involves creating scalable oversight methods: ways for humans to supervise AI systems even as those systems become more capable than their overseers. Key mechanisms include iterated amplification (using AI assistance to extend human oversight capacity), RLHF (using human feedback signals to train models toward desired behavior), and eliciting latent knowledge (getting AI systems to honestly report what they know rather than what they predict humans want to hear). The blog served as a venue to develop and communicate these ideas to the research community, seeding further work by other alignment researchers.
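To make the amplification-and-distillation loop concrete, here is a minimal toy sketch in Python. It is an illustrative assumption, not code from the blog: the summing task, and the names `overseer`, `Model`, `amplify`, and `distill` are all hypothetical placeholders, and the "model" is just a lookup table standing in for a trained network.

```python
def overseer(question, ask):
    """A bounded overseer: answers small questions directly, otherwise
    splits the question and delegates the pieces via `ask`."""
    if len(question) <= 2:
        return sum(question)
    mid = len(question) // 2
    return ask(question[:mid]) + ask(question[mid:])


class Model:
    """Stand-in for a learned model: a lookup table of distilled answers."""
    def __init__(self):
        self.table = {}

    def answer(self, question):
        return self.table.get(question)


def amplify(model, question):
    """The overseer assisted by model calls; recurses when the model
    has no distilled answer for a subquestion yet."""
    def ask(sub):
        cached = model.answer(sub)
        return cached if cached is not None else amplify(model, sub)
    return overseer(question, ask)


def distill(model, questions):
    """'Training' step: cache amplified answers so the next round's
    amplified overseer builds on a stronger model."""
    for q in questions:
        model.table[q] = amplify(model, q)


model = Model()
distill(model, [(1, 2, 3, 4), (5, 6, 7, 8)])     # round 1: distill subtasks
print(amplify(model, (1, 2, 3, 4, 5, 6, 7, 8)))  # round 2 answers from cached pieces -> 36
```

In the actual proposal the lookup table is replaced by a model trained via imitation or reinforcement learning, and the loop repeats: each round's distilled model makes the next round's amplified overseer more capable, which is what lets oversight scale with the system being overseen.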
Grants Received
No grants recorded.
Projects
No linked projects.
People
No linked people.
Details
- Last Updated: Apr 2, 2026, 9:53 PM UTC
- Created: Mar 19, 2026, 10:30 PM UTC
