Steve Byrnes is an AGI safety researcher based in Boston who studies the alignment problem through the lens of neuroscience and brain algorithms. His central claim is that the most dangerous future AI systems will be 'brain-like AGI' — systems built by reverse-engineering the computational principles of human intelligence, likely based on actor-critic reinforcement learning — and that understanding how biological brains implement goals, reward, and learning is essential to solving alignment for such systems. He is a Research Fellow in the Neuro & AGI program at the Astera Institute, where he has been based since August 2022, following earlier grant funding from the CEA Donor Lottery.
Funding Details
- Annual Budget: -
- Monthly Burn Rate: -
- Current Runway: -
- Funding Goal: -
- Funding Raised to Date: -
- Fiscal Sponsor: -
Theory of Change
Byrnes believes that the most dangerous AI systems will be built using brain-like architectures: variants of model-based actor-critic reinforcement learning that mirror how human brains implement goals and learning. His theory of change is that by deeply studying neuroscience now and developing a rigorous technical understanding of how goal-directed cognition works in biological brains, the field can anticipate alignment problems specific to such systems before they are built and design solutions in advance. He produces technical research and pedagogy (blog series, preprints, and public writing) intended to build shared understanding in the alignment community, contribute concrete architectural proposals for safer brain-like AGI, and train future researchers to think carefully about this class of system.
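For readers unfamiliar with the term, the sketch below shows the bare actor-critic pattern this class of architecture builds on: a critic learns value estimates from temporal-difference (TD) errors, and an actor adjusts its policy in the direction those same errors indicate. This is a minimal, model-free, tabular toy, not Byrnes' proposal (his hypothesis concerns far richer model-based variants); the two-state environment, hyperparameters, and every name in it are illustrative assumptions, not anything from his work.

```python
import math
import random

# Hypothetical two-state toy environment (an assumption for illustration):
# taking action 1 in state 0 yields reward 1 and moves to state 1;
# every other transition yields reward 0 and returns to state 0.
N_STATES, N_ACTIONS = 2, 2
ALPHA_CRITIC, ALPHA_ACTOR, GAMMA = 0.1, 0.05, 0.9

values = [0.0] * N_STATES                             # critic: V(s) estimates
prefs = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # actor: action preferences

def softmax(xs):
    """Turn a list of preferences into a probability distribution."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def env_step(state, action):
    """Toy dynamics: only (state 0, action 1) is rewarded."""
    if state == 0 and action == 1:
        return 1, 1.0
    return 0, 0.0

state = 0
for _ in range(5000):
    probs = softmax(prefs[state])
    action = random.choices(range(N_ACTIONS), weights=probs)[0]
    next_state, reward = env_step(state, action)

    # TD error: how much better the outcome was than the critic predicted.
    td_error = reward + GAMMA * values[next_state] - values[state]

    # Critic update: move V(s) toward the bootstrapped TD target.
    values[state] += ALPHA_CRITIC * td_error

    # Actor update: policy-gradient step on the softmax preferences,
    # using d ln pi(a|s) / d pref(s,b) = 1{b == a} - pi(b|s).
    for b in range(N_ACTIONS):
        grad = (1.0 if b == action else 0.0) - probs[b]
        prefs[state][b] += ALPHA_ACTOR * td_error * grad

    state = next_state

print("critic values:", [round(v, 2) for v in values])
print("actor preferences:", [[round(p, 2) for p in row] for row in prefs])
```

Run long enough, the critic's value for state 0 rises toward the discounted return of the rewarding action and the actor's preference for action 1 in state 0 comes to dominate. This pairing of a learned value signal with a policy trained against it is the core pattern the phrase "actor-critic" refers to throughout this profile.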
Grants Received
No grants recorded.
Projects
No linked projects.
People
No linked people.
Details
- Last Updated: Apr 2, 2026, 9:57 PM UTC
- Created: Mar 19, 2026, 10:30 PM UTC