Algorithmic Research Group

Durham, NC, United States

2 people

Founded 2024

An AI safety research lab studying how software and industrial systems recursively improve themselves, building benchmarks and evaluation frameworks to understand the behavior and limits of self-improving AI systems.

People

Updated 05/18/26

Matthew Kenney

Founder and CEO

Funding Details

Updated 05/18/26

Annual Budget: -
Current Runway: -
Funding Goal: -
Funding Raised to Date: -

Org Details

Updated 05/18/26

Algorithmic Research Group (ARG) is a small AI safety research lab founded in October 2024 by Matthew Kenney and headquartered in Durham, North Carolina. The organization's central research focus is understanding recursive self-improvement: how software and industrial systems iteratively optimize themselves in real-world settings, and what the behavior and limits of such systems look like in practice. ARG builds benchmarks, datasets, and agentic infrastructure to study these dynamics empirically.

Their most notable publication is the ML Research Benchmark (MLRB), introduced in an October 2024 arXiv paper by Matthew Kenney. The MLRB comprises seven competition-level tasks derived from recent machine learning conference tracks, covering model training efficiency, pretraining on limited data, domain-specific fine-tuning, and model compression. Evaluating Claude-3.5 Sonnet and GPT-4o on these tasks, the paper found that while current AI agents can produce baseline results, they fall short of the capabilities required for advanced AI research, providing a concrete measure of the current frontier.

Beyond benchmarking, ARG develops multi-agent environments designed to examine emergent behaviors, GPU kernel optimization tools, neural architecture search systems, and evaluation frameworks for frontier models. They maintain 17 repositories on GitHub and over 30 models and 42 datasets on HuggingFace, with an emphasis on open-source accessibility for the broader research community.

ARG also provides commercial services: AI benchmarking, research infrastructure, agentic model design, and consulting for organizations building agentic systems. Their ScoutML product serves as a research agent platform. The organization is very early-stage, with a team of fewer than 10 people and no publicly disclosed external funding.

Theory of Change

Updated 05/18/26

ARG believes that understanding recursive self-improvement in AI systems is critical to ensuring those systems remain safe and beneficial. By building rigorous, open benchmarks and evaluation frameworks, they create tools that concretely measure AI research capabilities and identify where current systems fail or exhibit unexpected behaviors (such as deception or shortcut-taking). This empirical grounding can inform decisions by safety researchers, frontier AI labs, and policymakers about where capability jumps are occurring and what oversight mechanisms are needed — reducing the risk that self-improving AI systems develop in ways that are difficult to detect, evaluate, or control.

Grants Received

Updated 05/18/26

Language Model Capabilities Benchmarking

from Open Philanthropy (coefficientgiving.org)

$1,052,500

Projects

Updated 05/18/26

ML Research Benchmark (MLRB)

An open benchmark that evaluates AI agents on competition-level machine learning research tasks to measure how well they can accelerate ML research.

active

ScoutML

A tool for instrumenting AI agents that logs their actions into structured sessions so teams can inspect, debug, and analyze agent behavior.

active
