Database
Showing 1201-1250 of 4525 results
AI Safety impact - research, education, coordination
Proving Computational Hardness of Verifying Alignment Desiderata
Support for AI alignment outreach in France (video/audio/text/events) & field-building
This grant will support Daniel Filan in producing 18 episodes of AXRP, the AI X-risk Research Podcast. The podcast aims to increase in-depth understanding of potential risks from artificial intelligence.

Rusheb Shah is a Research Engineer at Apollo Research, an AI safety organization focused on evaluating and auditing high-risk failure modes in frontier AI systems. He holds a Master's degree in Materials Science from the University of Oxford and completed the Alignment Research Engineer Accelerator (ARENA) program to transition into technical AI safety work. Before joining Apollo Research in December 2023, he briefly worked at OpenAI and previously held software engineering roles at R3, Brainlabs, and Amazon Web Services. His research at Apollo Research focuses on LLM evaluations, including co-authoring work on evaluations-based safety cases for AI scheming and research on scalable black-box jailbreaks via persona modulation. He also contributed to the mechanistic interpretability library TransformerLens by adding BERT support and won first prize at the ARENA Interpretability Hackathon for work on circuit discovery algorithms.
Researcher in AI and Infectious Diseases
Jeremy Gillen is an independent AI alignment researcher based in Berkeley, California, working primarily on agent foundations and the ontology identification problem. He holds an undergraduate degree in Computer Science and Neuroscience with a thesis on statistical learning theory. He participated in the SERI MATS (ML Alignment Theory Scholars Program) cohort 2 under mentor John Wentworth, where he co-authored "Finding Goals in the World Model", a proposal for aligning model-based RL systems by identifying human values in a world model and using inverse reinforcement learning to guide the policy. Following MATS, he received a Long-Term Future Fund grant to continue independent research on alignment problems in model-based RL. He subsequently joined Vivek Hebbar's team at MIRI (Machine Intelligence Research Institute) before returning to independent research. His current work focuses on the ontology identification problem and related natural abstractions research, with recent co-authored work on condensation and natural latents. He is an active contributor to LessWrong and the AI Alignment Forum, and has participated in public debates on AI corrigibility.
Petr Salaba is a documentary filmmaker and science communicator focused on AI alignment. His recent work includes YouTube video essays for the Dwarkesh Patel Podcast.
Executive in the nonprofit sector
James Norris is the Founder and Executive Director of the Center for Existential Safety and the Founder and CEO of Upgradable and Survival Sanctuaries, with over 20 years of experience in social entrepreneurship, behavior change, and systems innovation. He has co-founded or helped build 27 organizations, including the international conference for the effective altruism movement.

Independent writer and blogger based in NYC, best known for his Substack "Don't Worry About the Vase" (thezvi.substack.com), which has over 32,000 subscribers and focuses primarily on AI developments, policy, and rationality. He holds a bachelor's degree in mathematics from Columbia University and was a professional Magic: The Gathering player, inducted into the Magic Hall of Fame in 2007. He previously co-founded and served as CEO of MetaMed, a medical research analysis firm, and has worked at Jane Street Capital. He is a board member of the Center for Applied Rationality and founded Balsa Research, a nonprofit policy think tank focused on evidence-based regulatory reform. His writing covers AI safety, AI policy, government regulation, economics, and strategic thinking, and he is a prominent figure in the rationalist community on LessWrong.
A leading private research university on Chicago's South Side that hosts several AI safety and existential risk research programs, including the Existential Risk Laboratory (XLab), the Chicago Human+AI Lab, and the Harris School's Technology and Society Initiative.
Samuel Hammond is director of Artificial Intelligence Policy and chief economist at the Foundation for American Innovation, where his research focuses on innovation policy, artificial intelligence, and the institutional impact of emerging technologies. He previously served as director of social policy at the Niskanen Center, worked as an economist for the Government of Canada, and was a graduate research fellow at the Mercatus Center.
Earendil is a hardware security startup that builds tamper response systems for AI compute infrastructure, including GPU clusters, to support hardware-enabled governance and compliance verification for AI development.
Bryce Hidysmith was previously a designer and strategist for the hedge fund Numerai and writes essays at hidysmith.com.
Executive Director of CEEALAR, where he is leading its evolution into a high-impact EA hub. He brings experience in R&D leadership at global organisations, the startup ecosystem, education, customer service and military service, and focuses on creating communities where talented people can do impactful work.
Early contributor to the AI Safety Awareness Project remembered in its "In Memoriam" section. He previously played online poker professionally for five years, spent four months at the Recurse Center focusing on AI and reinforcement learning, received a grant from the Machine Intelligence Research Institute (MIRI) to upskill in machine learning, and participated in AI Safety Camp as both participant and organizer. He later founded Poker Camp, a nonprofit that uses poker to teach probability, AI, and decision-making under uncertainty. He held a B.S. in Electrical Engineering from Northwestern University and an M.Sc. in Electrical Engineering from the Technion – Israel Institute of Technology.
Extending an AI control evaluation to include vulnerability discovery, weaponization, and payload creation
José Luis Cordeiro is a Venezuelan-Spanish engineer, economist, and futurist who chairs The Millennium Project’s Venezuela Node and serves on its Board of Directors. He founded the World Future Society Venezuela, previously directed the Venezuela Chapter of the Club of Rome, holds degrees from Universidad Simón Bolívar, INSEAD, and MIT, and has authored more than ten books, including the bestseller La muerte de la muerte on radical life extension.
Eric Easley is an AI safety researcher and machine learning researcher affiliated with the ML Alignment & Theory Scholars (MATS) program, based in San Francisco, CA. He holds a Bachelor's degree in Physics and Economics from The Ohio State University. Prior to focusing on AI alignment research, he gained software engineering experience at companies including PullRequest and Coursera. He received a grant from the Long-Term Future Fund (LTFF) for a 6-month research stipend focused on removing conditional bad behaviors from large language models via a learned latent space intervention, exploring how latent representations in LLMs can be used to durably eliminate undesirable model behaviors rather than merely suppress them. His GitHub profile (username: pseudonom) shows a background in functional programming, particularly Haskell.
Design and product specialist working at Atlas Computing.

Adam Shimi is a French AI safety researcher currently working as a Policy Researcher at ControlAI, a non-profit focused on preventing the development of unsafe superintelligence, based in London. He completed his PhD in theoretical computer science at the Université de Toulouse (IRIT) in 2020, where his dissertation focused on distributed computing and the Heard-Of model, and holds an engineering degree in Computer Science and Applied Mathematics from ENSEEIHT (2014-2017). Prior to ControlAI, he was a co-founder and early staff member at Conjecture, an AI alignment research startup, where he also founded Refine, a three-month incubator for conceptual alignment researchers to develop new research directions. He has conducted independent AI alignment research with support from the Long-Term Future Fund and is an active contributor to the Alignment Forum and LessWrong, with over 124 posts and 875 comments. His research interests span epistemology of alignment, agent foundations, goal-directedness, and the methodology of AI safety research, and he blogs at For Methods (formethods.substack.com).
Romain Graux is a systems specialist and machine learning engineer at EPFL’s Computational Molecular Design Laboratory who writes about AI safety, deep learning, and software. Public profiles indicate that he works at EPFL in a systems specialist role, and event listings show him helping to host and organize AI safety events at EPFL, including the AI Horizons panel on the Swiss AI Initiative and earlier multi-agent safety hackathons.
B.S. Behavioral Economics; Biz dev lead and PM for various early-stage tech startups, mostly in the Medtech and AR/VR space.
Funding top-up for an early-career researcher to attend the Global Challenges Project (GCP) Workshop for career exploration in mitigating GCRs
344 MIT rules merged into Microsoft Agent Governance Toolkit, Cisco AI Defense, MISP, OWASP. Microsoft Copilot SWE Agent uses ATR for CVE triage.