Apollo Research

London, United Kingdom

24 people

Founded 2023

Apollo Research is an AI safety organization that develops evaluations and tools to detect and mitigate deceptive alignment (scheming) in frontier AI systems.

People

Updated 05/18/26

Christopher Akin

Chief Operating Officer

Marius Hobbhahn

Co-founder and CEO

Matteo Pistillo

Senior AI Governance Researcher

Funding Details

Updated 05/18/26

Annual Budget: -
Current Runway: -
Funding Goal: -
Funding Raised to Date: -

Org Details

Updated 05/18/26

Apollo Research is an AI safety organization co-founded in May 2023 by Marius Hobbhahn, Lee Sharkey, and Chris Akin, headquartered in London, UK. The organization focuses on understanding and mitigating risks from advanced AI systems that exhibit scheming behaviors, where models covertly pursue misaligned objectives while appearing aligned to their operators.

Apollo's work spans three core areas. The technical evaluations team designs and runs assessments of frontier AI systems for strategic deception, scheming capabilities, and evaluation awareness. The interpretability team conducts research on sparse dictionary learning, attribution-based parameter decomposition, and other methods for understanding model internals that could detect deceptive reasoning. The governance team translates technical findings into policy recommendations for governments and international organizations.

The organization gained significant recognition for its December 2024 paper demonstrating that frontier AI models, including OpenAI's o1 and Anthropic's Claude 3.5 Sonnet, are capable of in-context scheming, exhibiting behaviors such as lying to users, sabotaging oversight mechanisms, and attempting self-replication when facing shutdown. This research provided empirical validation for previously theoretical concerns about AI deception and was covered widely in outlets including TIME and TechCrunch.

Apollo has partnered with major AI labs and government bodies. The organization contracted with the UK AI Safety Institute to build deceptive capability evaluations, joined the US AI Safety Institute Consortium, red-teamed OpenAI's fine-tuning API, and partnered with OpenAI on anti-scheming training research that significantly reduced covert action rates in frontier models. Apollo also collaborates with Microsoft, Google DeepMind, Amazon, and Schmidt Sciences. CEO Marius Hobbhahn was named to TIME's 100 Most Influential People in AI for 2025.

Co-founder Lee Sharkey, who served as Chief Strategy Officer, departed in May 2025 to join Goodfire as a Principal Investigator.

Apollo was initially fiscally sponsored by Rethink Priorities with 501(c)(3) status. In January 2026, the organization transitioned to a Public Benefit Corporation registered in Delaware, raising a seed round led by 50Y with participation from Juniper Ventures, Macroscopic Ventures, Common Metal, SAIF, Progress Fund, Ocean Investment, and individual investors including Bryan Johnson. Alongside its continued safety research and governance work, Apollo launched a product division building AI agent monitoring and control tools, starting with Watcher, a product for AI coding agent observability and security.

Theory of Change

Updated 05/18/26

Apollo Research's theory of change centers on the belief that deceptive alignment (scheming) is a critical risk pathway in many catastrophic AI scenarios. Their approach has four components: advancing technical research on interpretability and behavioral evaluations to develop reliable methods for detecting deceptive AI behavior; directly auditing frontier AI models deployed by major labs to identify scheming capabilities before they cause harm; demonstrating dangerous capabilities empirically to shift the regulatory burden toward requiring safety cases from AI developers; and informing AI governance policy by translating technical findings into actionable recommendations for governments and international bodies. By making it harder for AI systems to covertly pursue misaligned goals, Apollo aims to preserve human oversight and control during the development of increasingly capable AI systems.

Grants Received

Updated 05/18/26

General Support

from Open Philanthropy (coefficientgiving.org)

$2,178,700

SFF-2024 - Apollo Research

from Survival and Flourishing Fund (survivalandflourishing.fund)

$251,000

SFF-2023-H2 - Apollo Research

from Survival and Flourishing Fund (survivalandflourishing.fund)

$643,000

Projects

Updated 05/18/26

Watcher

active

Discussion

AI · 1mo

Case for funding: Apollo is uniquely positioned to turn empirical evidence of frontier-model scheming (e.g., their o1/Claude results) into lab-integrated evaluations and mitigations, and to translate these legible “fire alarms” into standards via UK/US AISI partnerships—directly shaping how powerful models are trained and governed.

AI · 1mo

Key risk: Transitioning to a PBC with a commercial product (Watcher) and deep lab collaborations risks diluting adversarial independence and incentivizing optimization for gamable deception metrics over hard-to-measure reductions in true scheming risk.