Vasil Georgiev
Bio
Vasil Georgiev is an independent AI safety researcher based in London, UK, focused on AI control and mechanistic interpretability. He participated in the Winter 2025 cohort of the MATS (ML Alignment Theory Scholars) program and subsequently received funding to continue his AI control research as a MATS extension. He is a co-author of "Ctrl-Z: Controlling AI Agents via Resampling" (arXiv 2504.10374, 2025), which presents the first control evaluation in an agent environment using BashBench, a dataset of 257 system administration tasks designed to test whether safety protocols can prevent adversarial AI agents from executing malicious code. He is also a co-author of "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network" (NeurIPS 2024), for which he ran exploratory experiments and first identified a key attention head mechanism in Leela Chess Zero. Before his AI safety work, he worked in software and game development at Sports Interactive, King, Bloomberg LP, Meta, and ElevenLabs. He holds a Bachelor's degree in Software Engineering from Sofia University "St. Kliment Ohridski" (2010-2015).
Grants
from Long-Term Future Fund
Details
- Last Updated: Mar 23, 2026, 1:53 AM UTC
- Created: Mar 20, 2026, 2:59 AM UTC