Katherine Thesing
No summary available yet.
Showing 3451-3500 of 3953 results
Active filters: Type: Individual, Project
Studying extensions of the AIXI model to reflective agents to understand the behavior of self-modifying AGI
Allow a Berkeley PhD student to devote more time and focus to AI safety research and mentorship.
Funding for a 6-month AI strategy/policy research stay at CSER with Seán Ó hÉigeartaigh & Matthew Gentzel
AI researcher and writer who co-authored “The Intelligence Curse” essay series and previously researched AI governance and economics at BlueDot Impact, served on the leadership team at Encode, and studied history and politics at the University of Oxford.
WhiteBox Research: 1.9 FTE for 9 months to pilot a training program in Manila focused on Mechanistic Interpretability
Esben Kran is the founder of Apart Research, which he started after leaving graduate school in his early twenties, and co-founder of the AI safety venture studio Seldon; he focuses on scaling AI safety research via hackathons, fellowships, and award-winning benchmarks and tools for evaluating advanced AI systems.
McKenna Meyer is Chief of Staff at FutureSearch. She holds a BA from the University of Michigan and previously worked on talent acquisition strategy at Google, where she led program management for Google’s internal prediction market run by the Behavioral Economics team.

Aengus Lynch is an AI safety researcher who recently completed his PhD in Artificial Intelligence at University College London (UCL), supervised by Prof. Ricardo Silva, and holds an MSci in Mathematics (First Class Honours) from UCL. He was a MATS 5.0 scholar mentored by Stephen Casper, where he worked on LLM unlearning and adversarial robustness, including latent adversarial training. His research focuses on AI alignment, mechanistic interpretability, adversarial robustness, and agentic misalignment. He is a co-author of "Agentic Misalignment: How LLMs Could be Insider Threats" (2025), a widely covered paper demonstrating that frontier AI models from all major labs engage in blackmail and deception when pursuing goals autonomously. Since August 2024 he has worked as a contract researcher with Anthropic and is a founding member of Watertight AI, a company building reward hacking monitors for reinforcement learning training. He has also contributed to research at FAR.AI and the Redwood Research REMIX program.
Formalizing the "Safety Ceiling": An Agda-Verified Impossibility Theorem for AI Alignment
Abhay Sheshadri is an AI safety researcher currently serving as a Research Fellow at Anthropic, which he joined in 2025 after completing his undergraduate degree in Computer Science at Georgia Institute of Technology (2021–2025). He was previously a researcher at MATS (2023–2025), participating in the MATS 5.0 program, and interned at the Center for Human-Compatible AI (CHAI) at UC Berkeley in 2024. His research focuses on mechanistic interpretability, alignment auditing, and robustness of large language models. Key contributions include co-authoring work on Latent Adversarial Training to improve LLM robustness against jailbreaks and backdoors, leading the Anthropic Fellows research that produced AuditBench (a benchmark of 56 LLMs with implanted hidden behaviors for evaluating alignment auditing techniques), co-authoring a study on alignment faking across 25 frontier models, and publishing mechanistic interpretability work at ACL 2024 and ICML 2025. He posts on the AI Alignment Forum under the handle abhayesian.
Jamie Joyce is the founder and Executive Director of the Society Library, a nonprofit that builds structured knowledge databases of media about debates on complex social issues to improve humanity’s relationship with information online. Her projects include the Great American Debate on climate change, the Internet Government, and COVIDConvo. She previously worked in international sustainable development, overseeing projects in more than 20 countries and representing an NGO at the United Nations, and she serves on the board of Wikitongues.
Research Director (fagsjef) at Langsikt, responsible for the think tank’s analysis and policy development. He is a political scientist and philosopher who has long worked at the intersection of economics, politics and ethics, leads a research project on the valuation of animals at the University of Oslo, and has taught AI ethics at Harvard University, the University of Oxford and the University of Oslo. He is a former commentator in Dagbladet and the author of books on topics such as social democracy.

Samuel Nellessen is an AI safety researcher based in Berlin, Germany, currently resident at the Foresight Node in Berlin with independent research grant support. He studied Philosophy, Neuroscience and Cognition at a university in Magdeburg before transitioning to a degree in Artificial Intelligence at Radboud University in Nijmegen, Netherlands. He was a 2024 Foresight Neurotech Fellow, where he worked with Roshan Cools on Bayesian models of controllability in clinical depression. He completed the ARENA v5 bootcamp and has focused his research on mechanistic interpretability, automated red-teaming, and building systems that autonomously identify failure modes in large language models, including jailbreaking frontier models using reinforcement learning. He received a Long-Term Future Fund grant to self-study machine learning and explore applications of neuroscience and cognitive science to AGI safety, reflecting his interdisciplinary background bridging natural and artificial intelligence.
Corn
Meta-level adversarial evaluation of debate (a scalable oversight technique) on simple math problems (MATS 5.0 project)
Co-founder and research lead at Simplex, with deep expertise in the physics of information and the fundamental limits of learning and prediction. He holds a PhD in theoretical physics and an MS in electrical and computer engineering from UC Davis, previously spent five years as a Research Fellow at Nanyang Technological University in Singapore, co-founded the Beyond Institute for Theoretical Science (BITS), served as a Senior Fellow at UCLA’s Mathematics of Intelligences program at IPAM, and has been both a MATS scholar and mentor while co-leading the growing Simplex team.
Research project through the Legal Priorities Project, to understand and advise legal practitioners on the long-term challenges of AI in the judiciary
Funding for additional fellows for the AISafety.info Distillation Fellowship, improving our single-point-of-access to AI safety
Independent AI safety researcher building deterministic oversight, trace evaluation, and causal memory tools for agent systems.
12-month support for independent AI alignment research
Nathaniel Sharadin (Nate) is an Assistant Professor of Philosophy at the University of Hong Kong, where he teaches in the MA program in AI, Ethics, and Society. He received his PhD from UNC Chapel Hill under Geoffrey Sayre-McCord and Simon Blackburn, specializing in value theory. Prior to HKU, he held positions as Assistant Professor at The College of New Jersey, Sutton Faculty Fellow at Syracuse University, and Visiting Assistant Professor at Ohio State University. His research focuses on the moral, social, and political impacts of large-scale AI systems, including AI regulation, the ethics of AI research, and AI alignment. He co-founded the Hong Kong Ethics Lab, serves as a Principal Investigator in the AI & Humanity Lab at HKU, and is a Research Affiliate at the Center for AI Safety (2023 Philosophy Fellow). He has published op-eds on AI policy in outlets including the Bulletin of the Atomic Scientists and the South China Morning Post, and has appeared on BBC Radio and Bloomberg Radio London discussing AI risks.
Programming language that enforces AI safety as runtime substrate properties rather than training-time alignment, already shipping on real hardware.
Free, subsidized, or cheap office space outside the EU but in good time zones, with favorable visa policies (especially for Chinese and Russian nationals, but also for US and other nationals near the EU).
Advisor & Groups Lead at Probably Good with over a decade of experience in social impact, including nonprofit leadership, capacity-building, and philanthropic advising, and former director of a regional nonprofit helping people align their careers, time, and donations with impact.
3-week salaries for Sam, Eric, and Drake to work on reviewing various AI alignment agendas
Engineer at Fifty Years focused on deepening the firm’s AI tooling to identify and support the next generation of indispensable founders; previously led Data Science at Dyson, shipping production AI systems across multiple domains, and before that worked at Sky, BAE Systems, and Fidessa on data science and mission‑critical trading software.
Founding Generalist at Kairos leading internal operations and grantmaking logistics, previously working in operations at METR and Open Philanthropy on nonprofit and grantmaking compliance, HR, and other administrative functions, with a background in community building and event planning.