Christopher Mathwin
Bio
Chris Mathwin is a mechanistic interpretability researcher based in Sydney, Australia. He holds a Master of Engineering in Civil Engineering from the University of Melbourne and moved into AI safety research through programs including AI Safety Camp (AISC8, 2023) and the ML Alignment Theory Scholars (MATS) program, where he worked under Lee Sharkey at Apollo Research.

His primary research focus is understanding how representations are distributed across attention heads in transformer models. This work produced the 2024 paper "Gated Attention Blocks: Preliminary Progress toward Removing Attention Head Superposition", co-authored with Dennis Akar. He has also participated in several mechanistic interpretability hackathons, including a top-ranked submission (with Guillaume Corlouer, at the London EA Hub) identifying a circuit for predicting gendered pronouns in GPT-2 Small.

He received a grant from the Long-Term Future Fund to support a six-month salary covering an AI Safety Camp project and continued independent mechanistic interpretability research. He is currently a Founding Research Engineer at Harmony Intelligence, an AI safety startup focused on evaluations and red teaming.
Grants
- From the Long-Term Future Fund
Details
- Last Updated
- Mar 22, 2026, 3:01 PM UTC
- Created
- Mar 20, 2026, 2:49 AM UTC