Lee Sharkey
Bio
Lee Sharkey is a mechanistic interpretability researcher and Principal Investigator at Goodfire AI, based in London. He co-founded Apollo Research, where he served as Chief Strategy Officer, and previously worked as a Research Engineer at Conjecture. His academic background spans preclinical medicine and neuroscience at the University of Cambridge, an MSc in Data Analytics from the University of Glasgow, and an MSc in Neural Systems and Computation from the University of Zurich and ETH Zurich. He also worked in international public health before transitioning to AI research.

He is best known for early foundational work on sparse autoencoders (SAEs) as a solution to representational superposition in neural networks. More recently, he has developed Attribution-based Parameter Decomposition (APD) and Stochastic Parameter Decomposition (SPD) as improved approaches to reverse-engineering neural network mechanisms. He is the lead author of the 2025 paper "Open Problems in Mechanistic Interpretability," a comprehensive review published in TMLR and co-authored with approximately 30 researchers. He also mentors scholars in the MATS program and has contributed key alignment research, including work on goal misgeneralization in deep reinforcement learning.
Links
- Personal Website: https://leesharkey.github.io/
- Twitter / X
- LessWrong: lee_sharkey
Grants
- Long-Term Future Fund
Details
- Last Updated: Mar 22, 2026, 10:52 PM UTC
- Created: Mar 20, 2026, 2:54 AM UTC