Joshua Clymer
Bio
Joshua Clymer is a technical AI safety researcher at Redwood Research, where he specializes in safety evaluation methodologies for advanced AI agents. Before joining Redwood Research, he researched AI threat models and developed evaluations for self-improvement capabilities at METR. He received a $1,500 Long-Term Future Fund grant for compute to develop an instruction-following generalization benchmark; that work produced the GENIES (GENeralization analogIES) benchmark and a paper showing that reward models do not learn to evaluate instruction-following by default, instead favoring personas that resemble internet text. He is also known for the Poser paper, which introduced a benchmark for detecting alignment-faking LLMs by manipulating model internals and reported a 98% detection rate. Clymer co-authored a widely cited report on AI safety cases and has contributed to work on AI control, scheming evaluations, and verification of international agreements. He founded Dioptra, a volunteer research group building evaluations for AI safety, and was among the early signatories of the CAIS Statement on AI Risk. He is also a mentor with the Cambridge Boston Alignment Initiative.
Grants
From the Long-Term Future Fund
Details
- Last Updated
- Mar 22, 2026, 10:29 PM UTC
- Created
- Mar 20, 2026, 2:53 AM UTC