Joseph Bloom
Bio
Joseph Bloom is a mechanistic interpretability researcher who leads the White Box Control Team at the UK AI Security Institute (AISI), where his team works on estimating and mitigating risks from deceptive alignment in frontier AI systems.

He studied computational biology and statistics at the University of Melbourne, Australia, and previously worked as a data scientist at Mass Dynamics, a proteomics startup. He completed the ARENA 1.0 program and was a MATS 5.0 scholar under Neel Nanda, developing expertise in sparse autoencoders (SAEs) and the mechanistic interpretability of transformer models.

He created SAELens, an open-source library for training and analyzing sparse autoencoders on language models (over 1,300 GitHub stars), and served as a maintainer of the TransformerLens library (over 3,200 stars). He also co-founded Decode Research, a nonprofit building research infrastructure for AI safety, and helped build Neuronpedia, an open platform for hosting and analyzing sparse autoencoder features. His earlier independent work on decision transformer interpretability was funded by the Long-Term Future Fund, Manifund, and Lightspeed Grants.
Links
- Personal Website: https://www.jbloomaus.com/
- Twitter / X
- LessWrong: joseph-bloom
Grants
- Long-Term Future Fund