Wilson Wu

Boulder, Colorado

Bio

Updated 03/23/26

Wilson Wu is a mathematician and AI safety researcher. He is pursuing a PhD in mathematics at the University of Colorado Boulder and is a researcher at the Alignment Research Center (ARC), where he works on a systematic, theoretically grounded approach to mechanistic interpretability. He completed his undergraduate degree in Electrical Engineering and Computer Science at UC Berkeley. His early research applied singular learning theory and compact proofs to interpretability problems. He received funding from the Long-Term Future Fund (LTFF) to upskill in the mathematics relevant to singular learning theory and to study neural network generalization on algorithmic tasks. He co-authored "Do language models plan ahead for future tokens?" (COLM 2024) and "Towards a unified and verified understanding of group-operation networks" (ICLR 2025); the latter reverse-engineers neural networks trained on finite group operations. He also serves as a mentor in the ARC stream of the MATS Summer 2026 program.

Community Signal

Updated 03/23/26

0 Upvotes

0 Downvotes

0 Endorsements

No endorsements yet.

Grants

Updated 03/23/26

LTFF 2024 Q3 - Wilson Wu

from Long-Term Future Fund (funds.effectivealtruism.org)

recipient: $20,000
