Adelin Kassler
Bio
Adelin Kassler is an AI safety researcher based in the Greater Boston area, focusing on detecting and mitigating deceptive behavior in large language models. She was a PhD candidate in Bioinformatics and Integrative Genomics at Harvard University, advised by Debora Marks, where she worked on machine learning tools for protein design and genomics; after qualifying in the PhD program, she left with a master's degree to pursue AI safety research. Before that, she held research positions in the Alkes Price Lab at the Harvard School of Public Health and the Hilary Finucane Lab at the Broad Institute of MIT and Harvard. She participated in the ML Alignment & Theory Scholars (MATS) program under Evan Hubinger at Anthropic. Her key AI safety contribution is the paper "Getting Models Drunk: Noise Injection Reveals Backdoor Behavior in LLM Sleeper Agents" (co-authored with Evan Hubinger, 2024), which develops weight noise injection as a method to surface hidden deceptive behaviors in language models prior to deployment. This work was supported by a grant from the Long-Term Future Fund.
Links
- Personal Website
- Twitter / X
- LessWrong
Grants
- Long-Term Future Fund
Details
- Last Updated
- Mar 22, 2026, 1:44 PM UTC
- Created
- Mar 20, 2026, 2:46 AM UTC