Dmitrii Krasheninnikov
Bio
Dmitrii (Dima) Krasheninnikov is an AI safety researcher who completed his PhD in machine learning at the University of Cambridge in December 2025, supervised by David Krueger and Rich Turner, and subsequently joined Anthropic. He holds an MSc in AI from the University of Amsterdam (cum laude) and previously held research positions at UC Berkeley's Center for Human-Compatible AI and Sony AI Zurich. His research spans interpretability, the science of deep learning, control, and security, with a focus on ensuring that advanced AI systems remain aligned with human values. He is known for coining the term "out-of-context learning" and for demonstrating that language models linearly encode, in their activations, the order in which facts were learned during training. He also co-authored "Defining and Characterizing Reward Hacking" (NeurIPS 2022) and has published work at NeurIPS 2024/2025, ICML 2024, and ICLR 2026. His PhD research in AI alignment was funded in part by the Long-Term Future Fund.
Links
- Personal Website: https://krasheninnikov.github.io/
- Twitter / X
- LessWrong: dmitrii-krasheninnikov
Grants
- Long-Term Future Fund
Details
- Last Updated: Mar 22, 2026, 3:44 PM UTC
- Created: Mar 20, 2026, 2:50 AM UTC