Aidan Ewart
Bio
Aidan Ewart is an AI safety researcher and mathematics undergraduate at the University of Bristol (2022-2026), based in the UK. He has participated in the MATS program as a Research Scholar, working primarily on adversarial robustness and latent adversarial training (LAT). He interned at Haize Labs (July-October 2024) in New York, where he built automated red-teaming infrastructure, contributed to red-teaming of OpenAI's o1 models, and took part in contract work with Anthropic. His published research includes co-authoring 'Sparse Autoencoders Find Highly Interpretable Features in Language Models' (ICLR 2024), 'Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs' (NeurIPS 2024), and 'AuditBench: Evaluating Alignment Auditing Techniques on Models with Hidden Behaviors' (2026). He has received funding from the Long-Term Future Fund for independent research on LM interpretability and for a MATS 5.0 extension focused on improving methods in latent adversarial training.
Links
- Personal Website: https://bayesteezian.net/
- Twitter / X
- LessWrong: baidicoot
Grants
- Independent research on LM interpretability, from the Long-Term Future Fund
- MATS 5.0 extension on latent adversarial training, from the Long-Term Future Fund
Details
- Last Updated
- Mar 22, 2026, 1:43 PM UTC
- Created
- Mar 20, 2026, 2:46 AM UTC