Joseph Bloom
Bio
Updated 03/22/26
Joseph Bloom is a mechanistic interpretability researcher currently leading the White Box Control Team at the UK AI Security Institute (AISI), where his team focuses on estimating and addressing risks associated with deceptive alignment in frontier AI systems. He studied computational biology and statistics at the University of Melbourne, Australia, and previously worked as a data scientist at Mass Dynamics, a proteomics startup. He became a MATS 5.0 scholar under Neel Nanda and completed the ARENA 1.0 program, during which he developed expertise in sparse autoencoders (SAEs) and mechanistic interpretability of transformer models. He created SAELens, an open-source library for training and analyzing sparse autoencoders on language models (over 1,300 GitHub stars), and served as a maintainer of the TransformerLens library (over 3,200 stars). He also co-founded Decode Research, an AI safety research infrastructure nonprofit, and helped build Neuronpedia, an open platform for hosting and analyzing sparse autoencoder features. Earlier independent work on decision transformer interpretability was funded by the Long-Term Future Fund, Manifund, and Lightspeed Grants.
Community Signal
Updated 03/22/26
No endorsements yet.
Links
Updated 03/22/26
- Personal Website: https://www.jbloomaus.com/
- Twitter / X
- LessWrong: joseph-bloom
- EA Forum: -