Dan Valentine
Bio
Updated 03/22/26
Dan Valentine is a Member of Technical Staff at Anthropic, where he works on AI safety and alignment research with a focus on scalable oversight. He previously worked as a full-stack software engineer in Toronto, Canada, before transitioning into technical alignment research with support from the Long-Term Future Fund. He participated in MATS Summer 2023 (cohort 4.0) under the mentorship of Ethan Perez. During and after MATS, he contributed to research on debate as a scalable oversight method, co-authoring "Debating with More Persuasive LLMs Leads to More Truthful Answers" (ICML 2024 Best Paper/Oral), which demonstrated that LLM debate helps both non-expert models and humans answer difficult questions more accurately. He is also a co-author on "Failures to Find Transferable Image Jailbreaks Between Vision-Language Models" (ICLR 2025), and contributed to earlier work on mesa-optimization using toy models (AISC8, 2023). Prior to focusing on AI safety, he studied at Dublin City University (2009-2013) and was involved in organizing the Toronto AI Safety community.
Community Signal
Updated 03/22/26
No endorsements yet.
Links
Updated 03/22/26
- Personal Website: -
- Twitter / X: -
- LessWrong: dan-molloy
- EA Forum: -