CommonClaim dataset

active

About

Updated 05/18/26

CommonClaim is a publicly released dataset assembled by the MIT Algorithmic Alignment Group for the paper “Explore, Establish, Exploit: Red Teaming Language Models from Scratch.” It consists of 20,000 GPT-3-generated factual statements, labeled by two human annotators as common-knowledge-true, common-knowledge-false, or neither. The dataset is intended to support research on detecting and studying model dishonesty and failure modes.
