Jacob Goldman-Wetzler
Bio
Jacob Goldman-Wetzler is an undergraduate student at Brown University and an AI safety researcher who holds a research position at Anthropic. He was a fellow at ML Alignment & Theory Scholars (MATS), where he co-developed Gradient Routing, a training method that applies data-dependent masks to gradients during backpropagation to isolate neural network capabilities to specific subregions. This work, published on arXiv in October 2024, has applications in mechanistic interpretability, robust machine unlearning, and scalable oversight of reinforcement learners. He subsequently co-authored follow-up research on knowledge localization for capability removal in large language models and on robustifying unlearning via distillation. Earlier in his career, he contributed to research on mixed-precision training for scientific machine learning. He is an active contributor to the LessWrong and Alignment Forum communities under the handle g-w1.
Links
- Personal Website: https://jacobgw.com/
- Twitter / X: none listed
- LessWrong: g-w1
Grants
No grants recorded.
Details
- Last Updated: Mar 22, 2026, 4:30 PM UTC
- Created: Mar 20, 2026, 3:00 AM UTC