Ann-Kathrin Dombrowski
Bio
Ann-Kathrin Dombrowski is a Member of Technical Staff and Research Engineer at FAR.AI, where she focuses on explainable AI, AI transparency, and mitigating the malicious use of AI models. She holds a PhD from Technische Universität Berlin, where her research examined a geometrical perspective on counterfactual explanations and attribution methods for deep neural networks. She participated in the ML Alignment and Theory Scholars (MATS) program as a scholar under Dan Hendrycks, contributing to research on representation engineering and knowledge removal, and subsequently received LTFF funding to extend that work on internal concept extraction. She also explored information processing in large language models as a PIBBSS affiliate. Her published work includes contributions to the WMDP benchmark for measuring hazardous knowledge in AI models, safety evaluation toolkits for open-source models, and research on the manipulability of neural network explanations.
Links
- Personal Website
- Twitter / X
- LessWrong
Grants
from Long-Term Future Fund
Details
- Last Updated: Mar 22, 2026, 2:16 PM UTC
- Created: Mar 20, 2026, 2:47 AM UTC