AI Control Research Program

Status: Active

About

Updated 05/18/26

The AI Control Research Program designs and evaluates control protocols for advanced AI systems under adversarial threat models. In the ICML oral paper "AI Control: Improving Safety Despite Intentional Subversion," researchers at Redwood Research study techniques that use weaker trusted models to oversee stronger untrusted models. Subsequent work applies these ideas in environments such as BashArena, BashBench, and LinuxArena, and informs best practices for AI labs and policymakers.

Details

Start Date: -
End Date: -
Expected Duration: -
Funding Raised to Date: -