Joe Kwon
Bio
Updated 03/22/26
Joe Kwon is an AI safety researcher and policy analyst based in Washington, DC. He holds a BS in Computer Science and Psychology from Yale University and has conducted research at MIT's Computational Cognitive Science Lab, where he studied moral and social cognition with Josh Tenenbaum and Sydney Levine. His technical background includes early RLHF work at OpenAI, empirical ML research at UC Berkeley with Jacob Steinhardt and Dan Hendrycks focused on evals and out-of-distribution detection, and a stint as a Research Engineer at LG AI Research working on multilingual large language models. He subsequently transitioned to AI governance work, completing a GovAI DC Fellowship focused on risks from internal AI deployment and automated R&D, and serving as a Technical Policy Analyst at the Center for AI Policy (CAIP). Most recently he has been an Astra Fellow working with Tom Davidson and Fabien Roger on threat modeling and ML experiments related to secretly loyal AI. He received a Long-Term Future Fund grant as a stipend to work on an ML safety project with the goal of joining an ML safety team full-time.
Community Signal
Updated 03/22/26
No endorsements yet.
Links
Updated 03/22/26
- Personal Website: https://www.joe-kwon.com/
- Twitter / X
- LessWrong: joe-kwon
- EA Forum: -