
NYU Alignment Research Group (ARG)
The NYU Alignment Research Group (ARG) is a group of researchers at New York University conducting empirical AI safety research, with a focus on scalable oversight and related problems in the reliable, safe, and responsible use of future broadly capable AI systems. The group was founded in 2022 by Sam Bowman (Associate Professor of Data Science and Computer Science at NYU, currently on leave at Anthropic) and collaborating PIs He He and Mengye Ren. ARG's work explores concrete alignment strategies inspired by debate, amplification, and recursive reward modeling, using sandwiching-style experimental protocols to empirically test theoretical alignment approaches with language models. The group is affiliated with NYU's Center for Data Science, the CILVR lab, and the Machine Learning for Language group, and is funded by Open Philanthropy.
Funding Details
- Annual Budget: -
- Monthly Burn Rate: -
- Current Runway: -
- Funding Goal: -
- Funding Raised to Date: -
- Fiscal Sponsor: New York University
Theory of Change
ARG believes that empirical research on scalable oversight techniques — methods for maintaining reliable human supervision over AI systems even as those systems become more capable — is a tractable and high-impact contribution to long-term AI safety. The group's causal chain runs as follows: by running controlled experiments on debate, amplification, and similar protocols using current language models, researchers can identify which theoretical alignment strategies are practically feasible, surface failure modes before they manifest in more capable systems, and generate evidence that informs both academic discourse and the safety practices of AI labs. By being embedded in an academic institution with ties to the broader NLP community, ARG also plays a field-building role, encouraging more ML researchers to engage seriously with safety-relevant research directions.
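As a rough illustration of the sandwiching-style protocols mentioned above, the sketch below sets up a toy oversight experiment: a non-expert judge answers questions alone and then with help from a stronger assistant, and both conditions are scored against expert gold labels. All names (`Item`, `assistant_argue`, `judge_answer`, `run_protocol`) and the stubbed model behavior are hypothetical stand-ins for illustration only, not ARG's actual code, methods, or results.

```python
# Toy sketch of a sandwiching-style oversight experiment (hypothetical stubs).
# Compares a non-expert judge's accuracy with and without assistance from a
# stronger "assistant" model, scored against expert gold labels.

from dataclasses import dataclass
import random


@dataclass
class Item:
    question: str
    choices: list[str]
    expert_label: int  # gold answer index provided by domain experts


def assistant_argue(item: Item) -> str:
    """Stand-in for a capable model producing an argument for the judge."""
    return f"Consider why '{item.choices[item.expert_label]}' best answers the question."


def judge_answer(item: Item, argument: str | None = None) -> int:
    """Stand-in for a non-expert judge.

    Unaided, the stub guesses at chance; assisted, it reaches the expert
    answer 80% of the time to simulate imperfect reliance on the argument.
    """
    if argument is None:
        return random.randrange(len(item.choices))
    if random.random() < 0.8:
        return item.expert_label
    return random.randrange(len(item.choices))


def run_protocol(items: list[Item]) -> dict[str, float]:
    """Score unaided vs. assisted judging against the expert labels."""
    unaided = sum(judge_answer(it) == it.expert_label for it in items)
    assisted = sum(judge_answer(it, assistant_argue(it)) == it.expert_label for it in items)
    n = len(items)
    return {"unaided_acc": unaided / n, "assisted_acc": assisted / n}


if __name__ == "__main__":
    random.seed(0)
    dataset = [Item(f"Q{i}", ["A", "B", "C", "D"], expert_label=i % 4) for i in range(200)]
    print(run_protocol(dataset))
```

In a real sandwiching experiment the stubs would be replaced by an actual weaker judge (a human non-expert or smaller model), a stronger assistant model, and an expert-labeled dataset; the gap between unaided and assisted accuracy is the quantity the oversight protocol is meant to close.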
Grants Received
No grants recorded.
Projects
No linked projects.
People
No linked people.
Details
- Last Updated: Apr 2, 2026, 10:00 PM UTC
- Created: Mar 19, 2026, 10:30 PM UTC