Felix Michalak
No summary available yet.
Showing 2601-2650 of 3284 results
Active filters: Type: Individual, Fund

Jay Bailey is a former software engineer from Brisbane, Australia who transitioned into AI safety research after several years working in software. He participated in the SERI MATS Summer 2022 cohort, studying mechanistic interpretability under Neel Nanda, and subsequently received grants to upskill in ML for AI safety and to collaborate with Joseph Bloom on the Decision Transformer Interpretability project, co-authoring work on feature representations in memory-augmented gridworld agents. After struggling with direct research contributions, he leveraged his engineering background to accelerate his collaborator's research. Recognizing a stronger theory of change in evaluations as governments and labs committed to AI red-teaming, he joined the UK AI Safety Institute (AISI) as a Research Engineer, spending approximately 18 months doing frontier LLM evaluation. He currently works at Arcadia Impact as Head of Technology and Standards, where he contributes to technical AI safety efforts and supports researchers transitioning into the field.
Paula Quigley is a Community Researcher with the Ada Lovelace Institute, working with communities to explore public perspectives on artificial intelligence and its societal impacts. She designs and facilitates workshops that bring diverse and underrepresented voices into AI policy conversations and governance, drawing on senior leadership experience across housing, social enterprise and community development in Northern Ireland.
Community member bio
Machine learning PhD student at the University of Oxford whose research focuses on reasoning, multi-agent systems, post-training and AI safety, and a co-author on work with Contramont Research on cryptographic backdoors in language models accepted at NeurIPS 2024.
Heron co-founder and senior AI security and policy researcher at the Institute for AI Policy and Strategy (IAPS).
Shane Tews is a nonresident senior fellow at the American Enterprise Institute, where she focuses on cybersecurity, internet governance, and technology and innovation policy, and president of Logan Circle Strategies, advising clients on global public policy for information and communications technologies.

David Quarel is a PhD student at the Australian National University (ANU), supervised by Marcus Hutter, where he researches AI safety, Universal Artificial Intelligence, and Mechanistic Interpretability. He holds a BSc in Physics and Mathematics and an MComp specialising in AI and Machine Learning. He co-authored the textbook "An Introduction to Universal Artificial Intelligence" (Routledge, 2024) alongside Marcus Hutter and Elliot Catt. Quarel serves as Head TA at ARENA, an AI safety education programme run by the London Initiative for Safe AI (LISA), where he develops course content and teaches technical AI safety topics. He previously worked as a research assistant at the Krueger AI Safety Lab (KASL) at the University of Cambridge, and received funding to support that residency period. He has several years of teaching experience at ANU across mathematics, theoretical computer science, and digital hardware design.
French-American computer scientist and pioneer of deep learning; Turing Award laureate and professor at New York University, known for work in artificial intelligence, machine learning, computer vision, robotics and image compression.

Alan Chan is a Research Fellow at the Centre for the Governance of AI (GovAI) in London, where he focuses on AI agent governance, transparency, and technical AI governance more broadly. He completed his PhD in Computer Science at Université de Montréal / Mila (Quebec AI Institute) in 2024, advised by Nicolas Le Roux and David Krueger, and holds an MSc and BSc from the University of Alberta. During his doctoral work, he conducted a research visit with David Krueger at Cambridge focused on evaluating non-myopia in language models and RLHF systems, work motivated by the view that non-myopia is a precursor to dangerous emergent properties like deceptive alignment. His research spans the development of alignment evaluations (cooperativeness, corrigibility), capability evaluations (non-myopia, deception), AI agent infrastructure and governance, model transparency, and incident analysis for autonomous systems. He has also been affiliated with the Bennett School of Public Policy at the University of Cambridge as a visiting researcher.
Conrad Stosz is Head of Governance at Transluce, where he leads work on AI evaluation standards and policy. He previously led the U.S. Center for Standards and Innovation and has held AI policy roles across the White House, Congress, and the Department of Defense, building on prior experience as a machine learning engineer.
PhD student at ETH Zurich, advised by Florian Tramèr, focusing on security and failure modes of artificial intelligence.
AIS researcher, PhD student at CHAI
Logan Riggs Smith is an independent AI safety and mechanistic interpretability researcher who goes by the handle "elriggs" on LessWrong and the Alignment Forum. He earned a BS and MS in electrical and computer engineering from Mississippi State University (2014-2021), where he focused on machine learning and wireless signal processing. He is best known as a co-author of "Sparse Autoencoders Find Highly Interpretable Features in Language Models" (ICLR 2024), alongside Hoagy Cunningham, Aidan Ewart, Robert Huben, and Lee Sharkey, an influential paper that helped establish sparse autoencoders as a core technique for mechanistic interpretability. He also contributed to shard theory research with Quintin Pope, Alex Turner, and Charles Foster. The Long-Term Future Fund supported Logan for over two years with six-month stipends of $40,000 each, funding his work on sparse autoencoders and language model tools for alignment research.

Kush Bhatia is a Research Scientist at Google DeepMind in San Francisco, having previously completed a postdoctoral fellowship at Stanford University under Christopher Ré. He earned his PhD in Electrical Engineering and Computer Sciences from UC Berkeley in 2022, where he was co-advised by Peter Bartlett and Anca Dragan, and his dissertation was titled "Learning when Objectives are Hard to Specify." Before Berkeley, he completed his undergraduate degree in Computer Science at IIT Delhi and spent two years as a research fellow at Microsoft Research India working with Prateek Jain and Manik Varma. His research spans statistical machine learning, high-dimensional statistics, optimization, and AI alignment, with a particular focus on problems at the intersection of human feedback and learning system objectives, including reward misspecification, reward hacking, and developing value-aligned systems. Notable works include "The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models" (ICLR 2022), "On the Sensitivity of Reward Inference to Misspecified Human Models" (ICLR 2023), and contributions to large language model prompting and training methodology. His postdoctoral work on safety in AI and value-aligned systems was supported by the Long-Term Future Fund.
Intercultural philosopher and co-founder of the Buddhism & AI Initiative, Adjunct Senior Fellow and former director of the Asian Studies Development Program at the East-West Center in Honolulu, and author of works including Buddhism and Intelligent Technology (2021) and Consciousness Mattering (2023).
Lukas Fluri is a PhD student in Computer Science at ETH Zurich, supervised by Prof. Florian Tramèr in the SPY Lab, where he researches when and how AI systems fail and how to prevent this. He holds a BSc in Computer Science and an MSc in Data Science, both from ETH Zurich, and was awarded an ETH Medal for his Master's thesis "Evaluating Superhuman Models with Consistency Checks," which proposed a framework for surfacing mistakes in superhuman AI models using logical consistency checks. His research spans AI safety, interpretability, model evaluation, red-teaming, reinforcement learning, and the science of deep learning, covering both theoretical and empirical approaches. Prior to his PhD, he completed research internships at the University of Cambridge and UC Berkeley, during which he received Long-Term Future Fund support for an unpaid internship focused on using theory and interpretability to increase the safety of AI systems. He is also involved with Zurich AI Safety (ZAIS), a community organization focused on AI safety capacity building in Switzerland.
Dmitrii (Dima) Krasheninnikov is an AI safety researcher who completed his PhD in machine learning at the University of Cambridge in December 2025, supervised by David Krueger and Rich Turner, and subsequently joined Anthropic. He holds an MSc in AI from the University of Amsterdam (cum laude) and previously held research positions at UC Berkeley's Center for Human-Compatible AI and Sony AI Zurich. His research spans interpretability, the science of deep learning, control, and security, with a focus on ensuring advanced AI systems remain aligned with human values. He is known for coining the term "out-of-context learning" and for demonstrating that language models linearly encode the training-order of facts in their activations. He also co-authored "Defining and Characterizing Reward Hacking" (NeurIPS 2022) and has published work at NeurIPS 2024/2025, ICML 2024, and ICLR 2026. He has received funding from the Long-Term Future Fund for his PhD research in AI alignment.
James Balzer is an Australian strategic foresight practitioner and the Foresight Lead at the Odyssean Institute, where he works with governments, international organisations and businesses on scenario mapping, horizon scanning and sense-making to build long-term resilience. He serves on the steering committee of the Next Generation Foresight Practitioners network, leads the Intergenerational Fairness in Cities community of practice at the School of International Futures, and previously helped found the World Economic Forum’s Future 50 Initiative to upskill young people in foresight. He also conducts research on anticipatory governance and carbon market reform with the Disruptive Futures Institute and holds teaching and advisory roles with institutions including Macquarie University and the Lee Kuan Yew School of Public Policy.

Thomas M. Kehrenberg is a machine learning researcher currently based at the Basque Center for Applied Mathematics (BCAMATH) in Bilbao, Spain. He completed his PhD at the University of Sussex in 2021 with a thesis titled "Learning with biased data: invariant representations and target labels," and subsequently held a visiting research fellowship there. His primary academic research focuses on fairness and bias mitigation in machine learning, including adversarial support-matching and null-sampling techniques for interpretable and fair representations, with publications at venues such as ECCV and TMLR. In 2022, he received a grant from the Long-Term Future Fund (LTFF) for a six-month self-study period to build background knowledge for AI alignment research, during which he studied topics including VNM rationality, type theory, and topology. He subsequently wrote a post on LessWrong sharing advice for others undertaking similar alignment self-study, and has also published on the Alignment Forum exploring finite factored sets.
Operations Specialist at Probably Good with extensive experience supporting teams to achieve meaningful results; previously worked as a lawyer in Canada, developing processes and documentation to help clients navigate the legal system.
Assistant Professor of accounting at Mississippi State University whose research focuses on nonprofit accounting and how donors use accounting information, and who also serves as a trustee of CEEALAR.
Robert Farias is Senior Director of Partnerships at Mila – Quebec Artificial Intelligence Institute, where he leads partnership management and business development, building strategic links between Mila researchers, the broader AI ecosystem, and external partners so they can maximize the value of their collaboration with Mila.
Some random dude.
I maintain Agent Threat Rules (ATR), an MIT-licensed detection rule corpus for AI agent attacks. 344 rules. In production at Microsoft, Cisco, MISP, OWASP.
Director of Sustainability at Condé Nast, responsible for developing the company’s first global sustainability strategy, with an academic background in environmental policy and human rights from Sciences Po Paris.

Milan Griffes is a Principal at Lionheart Ventures, a venture capital firm based in Portland, Oregon, where he focuses on early-stage investments. He began his career as a Research Analyst at GiveWell (2014-2016), the leading charity evaluator, before pursuing an MHS in Mental Health at the Johns Hopkins Bloomberg School of Public Health — funded in part by an EA grant. He holds a bachelor's degree in History and Music from Michigan State University and studied Classics in Rome. He subsequently co-founded Atman, a psychedelic retreat that reached cash-flow positive status before being acquired by Odyssey in 2024, and served as Head of Risk at Sendwave, a Y Combinator-backed fintech company whose business unit was sold to WorldRemit in 2020 for $500M. Griffes is an advisor to the Qualia Research Institute and has been a prolific contributor to the EA Forum (over 4,500 karma, 100+ posts) writing on topics including AI safety, catastrophic risk, psychedelics as a cause area, effective altruism culture, and mental health. His personal blog, Flight From Perfection, covers ethics, contemplative practice, and social phenomena.