FutureEval
About
Updated 05/18/26
FutureEval is a continuously updated AI forecasting benchmark operated by Metaculus. It measures how accurately AI systems predict real-world outcomes across domains such as science, technology, health, geopolitics, and AI by running major models on Metaculus forecasting questions with a standardized prompt. FutureEval’s Model Leaderboard tracks model scores over time, while Bot Tournaments invite developers to submit AI forecasting systems to compete for $175,000 in annual prizes. Human baselines are provided by the Metaculus community and selected Pro Forecasters, allowing direct comparison between AI and top human forecasters. By highlighting where AI and human forecasts diverge and tracking trends in performance, FutureEval is intended to help organizations understand when AI forecasts can be trusted and how their capabilities are likely to evolve.
Theory of Change
FutureEval’s theory of change is that by rigorously benchmarking AI systems on real-world forecasting questions and comparing them against community and Pro Forecaster baselines, Metaculus can quantify when and how AI becomes reliable at probabilistic forecasting. This evidence can guide policymakers, researchers, and practitioners on when to incorporate AI forecasts into high-stakes decision-making, while also revealing where human forecasters still outperform AI and where additional safety, evaluation, or methodological work is needed.
Details
- Start Date: Feb 17, 2026
- End Date: -
- Expected Duration: -
- Funding Raised to Date: -