Keith Wynroe
Bio
Keith Wynroe is an independent mechanistic interpretability researcher focused on attention layers in transformer models. He graduated from Trinity College, Cambridge with double first class honours and previously worked as a Research Analyst at the Forethought Foundation for Global Priorities Research, William MacAskill's global priorities research organization. He participated in the SERI MATS (ML Alignment & Theory Scholars) program, working in Lee Sharkey's stream during the Winter 2023-24 cohort, and received LTFF grants to continue independent research afterward. His research outputs include "An OV-Coherent Toy Model of Attention Head Superposition" (co-authored with Lauren Greenspan, 2023) and "Decomposing the QK Circuit with Bilinear Sparse Dictionary Learning" (co-authored with Lee Sharkey, 2024), which applies bilinear sparse dictionary learning methods to understand how query and key features interact in attention circuits.
Links
- Personal Website
- Twitter / X
- LessWrong (keith_wynroe)
Grants
- Long-Term Future Fund
- Long-Term Future Fund
Details
- Last Updated
- Mar 22, 2026, 10:41 PM UTC
- Created
- Mar 20, 2026, 2:53 AM UTC