Joseph Bloom
Bio
Joseph Bloom is a mechanistic interpretability researcher who leads the White Box Control Team at the UK AI Security Institute (AISI), where his team works on estimating and mitigating risks from deceptive alignment in frontier AI systems.

He studied computational biology and statistics at the University of Melbourne, Australia, and previously worked as a data scientist at Mass Dynamics, a proteomics startup. He completed the ARENA 1.0 program and was a MATS 5.0 scholar under Neel Nanda, developing expertise in sparse autoencoders (SAEs) and the mechanistic interpretability of transformer models.

He created SAELens, an open-source library for training and analyzing sparse autoencoders on language models (over 1,300 GitHub stars), and served as a maintainer of the TransformerLens library (over 3,200 stars). He also co-founded Decode Research, a nonprofit building research infrastructure for AI safety, and helped build Neuronpedia, an open platform for hosting and analyzing sparse autoencoder features. His earlier independent work on decision transformer interpretability was funded by the Long-Term Future Fund, Manifund, and Lightspeed Grants.
Links
- Personal Website: https://www.jbloomaus.com/
- Twitter / X
- LessWrong: joseph-bloom
Grants
- Long-Term Future Fund