
NYU Alignment Research Group (ARG)
The NYU Alignment Research Group (ARG) is a group of researchers at New York University conducting empirical AI safety research, with a focus on scalable oversight and related problems in the reliable, safe, and responsible use of future broadly capable AI systems. The group was founded in 2022 by Sam Bowman (Associate Professor of Data Science and Computer Science at NYU, currently on leave at Anthropic) and collaborating PIs He He and Mengye Ren. ARG's work explores concrete alignment strategies inspired by debate, amplification, and recursive reward modeling, using sandwiching-style experimental protocols to empirically test theoretical alignment approaches with language models. The group is affiliated with NYU's Center for Data Science, the CILVR lab, and the Machine Learning for Language group, and is funded by Open Philanthropy.
Funding Details
- Annual Budget: -
- Monthly Burn Rate: -
- Current Runway: -
- Funding Goal: -
- Funding Raised to Date: -
- Fiscal Sponsor: New York University
Theory of Change
ARG believes that empirical research on scalable oversight techniques — methods for maintaining reliable human supervision over AI systems even as those systems become more capable — is a tractable and high-impact contribution to long-term AI safety. The group's causal chain runs as follows: by running controlled experiments on debate, amplification, and similar protocols using current language models, researchers can identify which theoretical alignment strategies are practically feasible, surface failure modes before they manifest in more capable systems, and generate evidence that informs both academic discourse and the safety practices of AI labs. By being embedded in an academic institution with ties to the broader NLP community, ARG also plays a field-building role, encouraging more ML researchers to engage seriously with safety-relevant research directions.
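As a rough illustration of the sandwiching-style protocols mentioned above, the sketch below sets up a toy oversight experiment: a non-expert judge answers questions alone and then with help from a stronger assistant, and both conditions are scored against expert gold labels. All names (`Item`, `assistant_argue`, `judge_answer`, `run_protocol`) and the stubbed model behavior are hypothetical stand-ins for illustration only, not ARG's actual code, methods, or results.

```python
# Toy sketch of a sandwiching-style oversight experiment (hypothetical stubs).
# Compares a non-expert judge's accuracy with and without assistance from a
# stronger "assistant" model, scored against expert gold labels.

from dataclasses import dataclass
import random


@dataclass
class Item:
    question: str
    choices: list[str]
    expert_label: int  # gold answer index provided by domain experts


def assistant_argue(item: Item) -> str:
    """Stand-in for a capable model producing an argument for the judge."""
    return f"Consider why '{item.choices[item.expert_label]}' best answers the question."


def judge_answer(item: Item, argument: str | None = None) -> int:
    """Stand-in for a non-expert judge.

    Unaided, the stub guesses at chance; assisted, it reaches the expert
    answer 80% of the time to simulate imperfect reliance on the argument.
    """
    if argument is None:
        return random.randrange(len(item.choices))
    if random.random() < 0.8:
        return item.expert_label
    return random.randrange(len(item.choices))


def run_protocol(items: list[Item]) -> dict[str, float]:
    """Score unaided vs. assisted judging against the expert labels."""
    unaided = sum(judge_answer(it) == it.expert_label for it in items)
    assisted = sum(judge_answer(it, assistant_argue(it)) == it.expert_label for it in items)
    n = len(items)
    return {"unaided_acc": unaided / n, "assisted_acc": assisted / n}


if __name__ == "__main__":
    random.seed(0)
    dataset = [Item(f"Q{i}", ["A", "B", "C", "D"], expert_label=i % 4) for i in range(200)]
    print(run_protocol(dataset))
```

In a real sandwiching experiment the stubs would be replaced by an actual weaker judge (a human non-expert or smaller model), a stronger assistant model, and an expert-labeled dataset; the gap between unaided and assisted accuracy is the quantity the oversight protocol is meant to close.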
Grants Received
No grants recorded.
Projects
No linked projects.
People
No linked people.
Details
- Last Updated: Apr 2, 2026, 10:00 PM UTC
- Created: Mar 19, 2026, 10:30 PM UTC