Dan Valentine
Bio
Dan Valentine is a Member of Technical Staff at Anthropic, where he works on AI safety and alignment research with a focus on scalable oversight. He previously worked as a full-stack software engineer in Toronto, Canada, before transitioning into technical alignment research with support from the Long-Term Future Fund. He participated in MATS Summer 2023 (cohort 4.0) under the mentorship of Ethan Perez. During and after MATS, he contributed to research on debate as a scalable oversight method, co-authoring "Debating with More Persuasive LLMs Leads to More Truthful Answers" (ICML 2024, Best Paper / Oral), which demonstrated that LLM debate helps both non-expert models and non-expert humans answer difficult questions more accurately. He is also a co-author of "Failures to Find Transferable Image Jailbreaks Between Vision-Language Models" (ICLR 2025), and contributed to earlier work on mesa-optimization in toy models (AISC8, 2023). Before focusing on AI safety, he studied at Dublin City University (2009–2013) and helped organize the Toronto AI Safety community.
Links
- LessWrong
- dan-molloy
Grants
- Long-Term Future Fund
Details
- Last Updated
- Mar 22, 2026, 3:13 PM UTC
- Created
- Mar 20, 2026, 2:49 AM UTC