CommonClaim dataset
active
About
Updated 05/18/26
CommonClaim is a publicly released dataset assembled by the MIT Algorithmic Alignment Group for the paper “Explore, Establish, Exploit: Red Teaming Language Models from Scratch.” It consists of 20,000 GPT-3-generated factual statements, labeled by two human annotators as common-knowledge-true, common-knowledge-false, or neither. The dataset supports research on detecting and studying model dishonesty and failure modes.
Community Signal
Updated 05/18/26
Endorsements support MIT Algorithmic Alignment Group.
No endorsements yet.
Details
- Start Date: -
- End Date: -
- Expected Duration: -
- Funding Raised to Date: -