An academic research group at New York University doing empirical work with language models to address longer-term safety concerns about highly capable AI systems.
People
Updated 05/18/26
Collaborating PI
Collaborating PI
Principal Investigator (PI)
Alumni researcher
Alumni researcher
Alumni researcher
Outside advisor
PhD student
Funding Details
Updated 05/18/26
- Annual Budget: -
- Current Runway: -
- Funding Goal: -
- Funding Raised to Date: -
Org Details
Updated 05/18/26
The NYU Alignment Research Group (ARG) was founded in October 2022 by Sam Bowman, an Associate Professor of Data Science and Computer Science at New York University, with the goal of conducting rigorous empirical research on AI alignment and safety using large language models. The group grew out of Bowman's conviction that the NLP and machine learning research communities needed to take AI safety concerns more seriously, and that academic researchers were well-positioned to contribute empirical work to safety-relevant problems that had largely been studied theoretically.
ARG focuses primarily on scalable oversight and related problems: how can humans reliably supervise AI systems that may eventually surpass human capabilities in many domains? The group's research agenda draws on ideas from AI safety via debate, amplification, and recursive reward modeling: approaches that attempt to use an AI system's own capabilities to bootstrap reliable human oversight on difficult tasks. Rather than studying these approaches in the abstract, ARG runs empirical experiments using sandwiching-style protocols, in which researchers pose artificial alignment challenges that must be solved using unreliable AI/ML tools.
The group launched with approximately a dozen researchers, including research scientists (Julian Michael, David Rein, Miles Turpin, Salsabila Mahdi) and PhD students (Jackson Petty, Jacob Pfau, Jason Phang, Alex Lyzhov, Angelica Chen), with Ethan Perez as outside advisor and Tim G.J. Rudner as a Faculty Fellow. The group is based at NYU's 60 Fifth Avenue building in Greenwich Village and has access to over 30 high-end GPUs plus cloud resources through NYU's Center for Data Science and CILVR lab. Funding for ARG comes from Open Philanthropy, which supports safety-focused research initiatives of this kind.
Bowman went on long-term leave from NYU to work on technical AI safety at Anthropic. As of 2024-2025, the group's website lists him as PI on leave, with active PhD students Jane Pan, Jackson Petty, and Jacob Pfau continuing research. A substantial cohort of alumni has moved on to organizations including Anthropic, OpenAI, METR, the UK AI Safety Institute, and Scale AI. The group does not accept public donations and operates entirely through institutional and philanthropic grant funding channeled through NYU.
Theory of Change
Updated 05/18/26
ARG believes that empirical research on scalable oversight techniques (methods for maintaining reliable human supervision over AI systems even as those systems become more capable) is a tractable and high-impact contribution to long-term AI safety. The group's causal chain runs as follows: by running controlled experiments on debate, amplification, and similar protocols using current language models, researchers can identify which theoretical alignment strategies are practically feasible, surface failure modes before they manifest in more capable systems, and generate evidence that informs both academic discourse and the safety practices of AI labs. Because it is embedded in an academic institution with ties to the broader NLP community, ARG also plays a field-building role, encouraging more ML researchers to engage seriously with safety-relevant research directions.
Grants Received: no grants recorded
Updated 05/18/26
Projects: no linked projects
Updated 05/18/26