6-month salary to develop tools to test the natural abstractions hypothesis
Founder of Aether, an independent research lab focused on foundation model agent safety, and PhD student in computer science at the University of Toronto (currently on leave), working on AI safety and LLM agents.
Liron Shapira is an entrepreneur and founder/CEO of Relationship Hero, a dating and relationship coaching platform, and host of DoomDebates.com.
A directory within AISafety.com that consolidates free guidance calls from AI safety advisors, helping newcomers identify how to contribute most effectively to the field.
Wes Ezzeddine is a Dubai-based technology leader and AI safety specialist with over 15 years of experience leading engineering teams across fintech, insurtech, mobility, and edtech. He has served as Director of Engineering at Mamo Pay, co-founded AI Safety UAE, and works on AI alignment, governance, and resilient systems while facilitating AI safety courses and events.
No summary available yet.

Charlie Griffin is a doctoral student in Computer Science at the University of Oxford, affiliated with Green Templeton College and supervised by Alessandro Abate and Marta Kwiatkowska. His research focuses on AI control — developing formal frameworks and safety evaluations for containing misaligned AI systems. He is affiliated with the UK AI Security Institute (AISI) and FAR.AI, where he has contributed research on AI-control games and untrusted monitoring. His notable publications include "Games for AI Control: Models of Safety Evaluations of AI Deployment Protocols" (with Buck Shlegeris and others) and "When can we trust untrusted monitoring? A safety case sketch across collusion strategies" (with researchers from Google DeepMind and AISI). He helped organize the first round of LASR Labs (London AI Safety Research) and continues as an advisor, and has received funding from the Long-Term Future Fund for alignment work including skilling up, assisting academics, personal research, and community building.
A leading U.S. law school that conducts research on AI governance, policy, and safety through its PULSE program and Institute for Technology, Law & Policy.
Enabling prosaic alignment research with a multi-modal model on natural language and chess
Rafael A. Calvo is a Professor at the Dyson School of Design Engineering at Imperial College London and co-lead of the Imperial College spoke of the Leverhulme Centre for the Future of Intelligence. His research focuses on the design of digital technologies that support psychological wellbeing, mental health and education, and on the ethical challenges raised by new technologies. He is co-author of the book Positive Computing and has served as co-editor of the IEEE Transactions on Technology and Society and on the editorial boards of several leading journals in affective computing and learning technologies.
Joanna Wiaterek is co-founder of the Centre for AI Security and Access with expertise in international AI governance coordination. She has worked on AI benefit-sharing, global AI safety, and foreign aid and is focused on fostering meaningful and inclusive AI futures.
Viktor Rehnberg is a Swedish AI safety researcher based in Gothenburg, Sweden. He holds an M.Sc. in Engineering Physics from Chalmers University of Technology, where he also worked as a Research Engineer at Chalmers e-Commons supporting ML/AI infrastructure. He participated in the SERI MATS (ML Alignment Theory Scholars) Winter 2022 program, conducting research on identifying key steps in reducing risks from learned optimization, including mesa-optimization and inner alignment problems. He has collaborated with Erik Jenner and Oliver Daniels-Koch on empirical mechanistic anomaly detection (MAD) research, co-authoring a LessWrong post on concrete empirical research projects in that area under the supervision of John Wentworth and Erik Jenner. He also participated in AI Safety Camp Edition 5, where his team investigated neural network modularity loss functions to improve interpretability. He is an organizer of EA Gothenburg and is motivated by effective altruism, longtermism, and preventing existential risk.
Humanitarian interested in tech
Jason Gross is a computer scientist and entrepreneur, co‑founder of Theorem (Theorem Labs), an AI and programming languages research lab focused on program verification; he holds a PhD in Electrical Engineering and Computer Science from MIT, where his research improved proof assistants and contributed verified cryptography now securing large volumes of HTTPS traffic.
Ajeya Cotra works at METR on threat modeling and risk assessment for loss-of-control risks from advanced AI. She previously led the technical AI safety program at Open Philanthropy (now Coefficient Giving), where she developed the influential “biological anchors” framework for forecasting when transformative AI might arrive.
Aaquib Syed is a CS and Mathematics undergraduate student at the University of Maryland, College Park, where he is a Banneker-Key Scholar. He was a fellow in the MATS 5.0 program under Neel Nanda's supervision, conducting mechanistic interpretability research on how refusal is implemented in large language models. His most notable work, "Refusal in Language Models Is Mediated by a Single Direction" (NeurIPS 2024), co-authored with Andy Arditi and others, showed that refusal behavior across 13 open-source chat models is controlled by a single direction in the residual stream. He also co-authored "Attribution Patching Outperforms Automated Circuit Discovery" (NeurIPS 2024 ATTRIB workshop) and work on mechanistic unlearning and model pruning. He is currently a Student Researcher at Google DeepMind on the Frontier Safety team, where he focuses on evaluating and forecasting dangerous AI capabilities.
No summary available yet.
No summary available yet.
No summary available yet.
No summary available yet.
No summary available yet.
No summary available yet.
Robert Miles (also known as Rob Miles) is a British science communicator and educator specializing in AI safety and alignment. He studied Computer Science at the University of Nottingham and dropped out of his PhD program around 2011 to focus on AI safety communication full time. He began creating content for the popular Computerphile YouTube channel around 2015 before launching his own channel, Robert Miles AI Safety, which has accumulated over 169,000 subscribers and more than 7.5 million views covering topics such as the orthogonality thesis, instrumental convergence, and inner misalignment. He is the founder of AISafety.info (also known as Stampy), a community-written interactive FAQ about AI existential risk, and has run the Distillation Fellowship, a paid program funding writers to distill AI safety research into accessible content for the site. He has co-produced the Alignment Newsletter Podcast with Rohin Shah and has received funding from the Long-Term Future Fund to support his educational work. He collaborates with organizations including MIRI and the Future of Humanity Institute to help communicate their research to broader audiences.
No summary available yet.
German computer scientist and leading machine learning researcher, Scientific Director of the ELLIS Institute Tübingen and Director of the Empirical Inference Department at the Max Planck Institute for Intelligent Systems, whose work focuses on machine learning and causal inference with applications ranging from astronomy and computational photography to robotics.
carer of life
A scientific diplomacy organization working to improve global catastrophic risk governance in Spanish-speaking countries, with focus areas spanning AI regulation, pandemic biosecurity, food security, and risk management systems.
No summary available yet.
Ankit Panda is the Stanton Senior Fellow in the Nuclear Policy Program at the Carnegie Endowment for International Peace, specializing in nuclear strategy, escalation dynamics, missiles and missile defense, space security, and U.S. alliances.
No summary available yet.
No summary available yet.
A 15,000+ page corpus on long-term interaction, symbolic language, unusual model behavior, and safety edge cases.
No summary available yet.
Storyboard artist and 2D animator known for work on the independent series HEATHENS, The Amazing Digital Circus, and Rational Animations’ YouTube videos.
No summary available yet.
No summary available yet.
Columbia University is an Ivy League research university in New York City with significant AI safety, governance, and policy research activity across multiple schools and centers.
No summary available yet.
No summary available yet.
AI safety researcher & pioneer of distributed systems for finance
Building an AI Safety & Responsible AI Ecosystem | New Zealand
Senior Researcher at Rethink Priorities working on existential security, previously involved in time-sensitive COVID-19 forecasting projects and earlier employed as a programmer at Impossible Foods and Google, as well as leading several effective altruism local groups.
6 months of independent alignment research and upskilling
No summary available yet.
President and CEO of Foresight Institute, directing programs in AI, longevity biotechnology, molecular nanotechnology, neurotechnology, and space; founder of ExistentialHope.com, co-editor of Superintelligence: Coordination & Strategy, and co-author of Gaming the Future.
three personalities in one
AI safety dinners
Experienced management consultant and serial entrepreneur. Seeks to maximise personal and professional impact.
No summary available yet.