Alex Cloud
Bio
Alex Cloud is an AI alignment researcher at Anthropic, affiliated with the Anthropic Fellows Program, where he focuses on developing principled methods for inducing safety-relevant structure in models. He holds a PhD in Statistics from North Carolina State University (2017-2023, advised by Eric Laber) and a BA in Mathematics from Pomona College. Before entering AI safety research, he worked on applied reinforcement learning at Riot Games AI and Amazon. His technical contributions include gradient routing, a method for localizing learning updates in neural networks; distillation for robust unlearning (NeurIPS 2025 spotlight); and co-leading the Subliminal Learning paper, which showed that language models can transmit behavioral traits through hidden signals in training data. He also mentors in the ML Alignment & Theory Scholars (MATS) program as part of Team Shard alongside Alex Turner, having transitioned from mentee to co-mentor beginning with MATS 7.0.
Grants
No grants recorded.
Details
- Last Updated
- Mar 22, 2026, 1:54 PM UTC
- Created
- Mar 20, 2026, 3:00 AM UTC