SDCPNs for AI Safety
Developing correct-by-construction world models for verification of frontier AI
Showing 1351-1400 of 3954 results
Surveying experts on AI risk scenarios and working on other projects related to AI safety.
Education and public health entrepreneur
Sunishchal Dev is an AI evaluation research scientist at RAND, where his work focuses on evaluating and governing artificial intelligence systems. Before joining RAND, he spent about a decade in industry as a data scientist and management consultant implementing AI solutions, and he holds a B.A. in technology and innovation management from the University of Washington Bothell.
No summary available yet.
No summary available yet.
Co-Founder of the AI Safety Initiative at Georgia Tech and fourth-year computer science student who also helped co-found the Effective Altruism club at Georgia Tech and is interested in AI safety field-building and technical work to mitigate long-term risks from transformative AI.
Katerina Manoli is a researcher with Sentience Institute who has a background in philosophy, psychology, and neuroscience and is a PhD student at the Max Planck Institute for Human Cognitive and Brain Sciences in Leipzig, Germany. She has conducted research on the perception of artificial agents and the ethics of human-AI interaction at institutions including the Donders Institute for Brain, Cognition and Behaviour, Harvard University's Department of Social Studies, and the University of Glasgow's School of Psychology and Neuroscience, and has served on boards of effective altruism organizations and as a STEM mentor for Effective Thesis.
Michael Hanna is a PhD student at the Institute for Logic, Language and Computation at the University of Amsterdam whose research focuses on interpreting and evaluating NLP models. As an Anthropic Fellow he led the development of the circuit-tracer library for feature circuits, which he now maintains as an open-source project integrated with Neuronpedia and supported by Decode Research.
Kunvar Thaman is a machine learning research engineer at Standard Intelligence in San Francisco. He studied at the Birla Institute of Technology and Science (BITS), Pilani. His work focuses on training large-scale neural networks and mechanistic interpretability research — reverse-engineering the internal computations of neural networks to understand how they work. He co-authored "Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts," presented at ICML 2024, which introduces a methodology called retro-holdouts to measure how much public benchmark scores are inflated by training data contamination. He also participated in AI Safety Camp (AISC9) where his team applied mechanistic interpretability methods to study out-of-context learning in neural networks. He maintains a research site at mechinterp.com and writes at kunvarthaman.com.
Federico L.G. Faroldi is a professor of Ethics, Law and Artificial Intelligence at the University of Pavia, where he directs the Center for Reasoning, Normativity and AI (CERNAI) and the Normative Risk Lab and holds a €1.2m FIS grant on "generic reasoning". He is also affiliate faculty at UC Berkeley's Center for Human-Compatible AI (CHAI). His current research focuses on AI safety and risk (especially regulation and governance), AI alignment using normative reasons, deontic and modal logic, truthmaker semantics, and the law and ethics of artificial intelligence.
Executive coach, writer, and serial social entrepreneur; founder and Executive Director of Original Position and creator of What Future World?, and previously Executive Director of The AI Governance Archive (TAIGA) as well as co-founder of LifeWork and the Small Business School Challenge.
No summary available yet.
No summary available yet.
No summary available yet.
No summary available yet.
No summary available yet.
No summary available yet.
6-month salary to build & enhance open-source mechanistic interpretability tooling for AI safety researchers
No summary available yet.
No summary available yet.
No summary available yet.
No summary available yet.
No summary available yet.
No summary available yet.
4-6 month salary to do circuit-based mech interp on Mamba, as part of the MATS extension program
No summary available yet.
Philippe Rivet is a Canadian electrical engineer based in Toronto, Ontario, who has pursued applied technical AI alignment research. He received an NSERC Undergraduate Student Research Award during his undergraduate studies and later completed studies related to machine learning and AI safety, including coursework from Dan Hendrycks's ML Safety program. His GitHub profile reflects interests in evolutionary computing, reinforcement learning, interpretability, and AI alignment. He has described his path as a "costly attempt at AI safety research" and has since considered earning-to-give as an alternative contribution to effective altruism. He received a $10,000 grant in support of applied technical AI alignment research. He is active in the effective altruism community, having published a post on the EA Forum titled "The Highly Sensitive EA," and maintains interests in contemplative practice and philosophy.
Independent researcher at the Institute of Computer Science (ICC), CONICET‑UBA, with an Electrical Engineering PhD from Stanford University; her main research area is machine learning applied to speech and language processing.
No summary available yet.
No summary available yet.

Alexis Carlier is the co-founder and CEO of Asymmetric Security, an AI-native digital forensics and incident response company that emerged from stealth in early 2026 with $4.2M in pre-seed funding. He was previously part of the founding team at the Centre for the Governance of AI (GovAI), where he served as Head of Strategy and contributed to the organization's transition from Oxford to an independent nonprofit. He also led AI security programs at RAND and the University of Oxford. Carlier holds a Master's degree in Economics from the Toulouse School of Economics. He received a Long-Term Future Fund grant in 2020 to survey leading AI safety and governance researchers on their beliefs about AI existential risk scenarios; the results were later published on the EA Forum and LessWrong, co-authored with Sam Clarke and Jonas Schuett. He has also co-authored academic papers on AI governance topics including AI ethics board design and frontier AI regulation.
Grant to cover fees for a master's program in machine learning
No summary available yet.
No summary available yet.

Neil Crawford is a PhD student in Logic and Philosophy of Science at UC Irvine whose research focuses on evolutionary game theory and its implications for AI safety. He has been involved in the Effective Altruism community since his undergraduate studies at the London School of Economics (2018) and has been a central figure in building the AI safety community at UC Irvine. He organized AI Alignment Irvine (AIAI), facilitating weekly alignment reading groups and AI safety dinners, and helped recruit a core group of PhD students engaged in AI alignment research. He also facilitated the Arete Fellowship at UCI, an 8-week introduction to Effective Altruism, and co-organized EA at UC Irvine. He has received grants from EA Funds (Long-Term Future Fund) to support his organizing work, including a stipend for running AIAI and funding for AI safety dinners. He is listed as a member of the AI Safety Community Researchers group at the Future of Life Institute.
No summary available yet.
Lead Researcher at Saving Humanity from Homo Sapiens, investigating how humanity can achieve very-long-term sustainability and a flourishing long-term future while avoiding global catastrophic and existential risks.

Matthias Georg Mayer is an independent researcher and mathematician working in AI safety, with a focus on agent foundations and providing formal guarantees for the safety of AI systems. He was a 2025 Alumni Fellow at Principles of Intelligence (PIBBSS), where he worked on theoretical alignment research. His earlier work centered on structural independence, a generalization of d-separation concepts applied to structural causal models (also known as Finite Factored Sets), and he co-authored the paper "Factored space models: Towards causality between levels of abstraction" (arXiv 2412.02579, December 2024) alongside Scott Garrabrant, Magdalena Wache, Leon Lang, Sam Eisenstat, and Holger Dell. More recently his interests have shifted toward the Learning Theoretic Agenda developed by Vanessa Kosoy, with particular focus on Infrabayesian-Physicalism as a means to address embedded agency. He has received funding from the Long-Term Future Fund (LTFF) to research framing computational systems in ways that surface meaningful concepts. He participates in the AI alignment community through the Alignment Forum and LessWrong under the handle matthias-g-mayer.
Remmelt Ellen is a Dutch AI safety organizer and researcher based in Amsterdam, Netherlands. He co-founded Effective Altruism Netherlands in 2017 and co-launched AI Safety Camp (AISC) in 2018, serving as an organizer for nearly every edition of the program. At AISC he oversees program design and currently coordinates Stop/Pause AI projects, working to onboard initiatives that advocate for halting or pausing unsafe AI development. His research focuses on AGI safety impossibility arguments, including formal reasoning on why advanced AI systems cannot be reliably controlled, developed in collaboration with a former Pentagon engineer. He authored the book "Artificial Bodies: How Machines Replace People" (2024), a critical examination of how Big Tech corporations create exploitative systems that consume human resources and disrupt ecological balance. He has received grants from the Long-Term Future Fund to support AI Safety Camp's virtual and in-person programs.
No summary available yet.
No summary available yet.
No summary available yet.
Assistant Professor in the Department of Linguistics and affiliate faculty in the Department of Computer Science at Boston University, where she leads research at the intersection of computational semantics, pragmatics, and generalization in human and machine learners. She received her PhD in Cognitive Science from Johns Hopkins University, and her work has been supported by funders including the NSF, Google, and Open Philanthropy.
Making a feature-length documentary on AI. I communicate risks from AI to the public.
No summary available yet.
No summary available yet.
6-month salary & operational expenses to start a cybersecurity & alignment risk assessment org