Aidan Ewart
Bio
Aidan Ewart is an AI safety researcher and mathematics undergraduate at the University of Bristol (2022-2026), based in the UK. He has participated in the MATS program as a Research Scholar, working primarily on adversarial robustness and latent adversarial training (LAT). He interned at Haize Labs (July-October 2024) in New York, where he built automated red-teaming infrastructure, contributed to red-teaming of OpenAI's o1 models, and took part in contract work with Anthropic. His published research includes co-authoring 'Sparse Autoencoders Find Highly Interpretable Features in Language Models' (ICLR 2024), 'Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs' (NeurIPS 2024), and 'AuditBench: Evaluating Alignment Auditing Techniques on Models with Hidden Behaviors' (2026). He has received funding from the Long-Term Future Fund for independent research on LM interpretability and for a MATS 5.0 extension focused on improving methods in latent adversarial training.
Links
- Personal Website: https://bayesteezian.net/
- Twitter / X
- LessWrong: baidicoot
Grants
- Independent research on LM interpretability, from the Long-Term Future Fund
- MATS 5.0 extension on latent adversarial training, from the Long-Term Future Fund
Details
- Last Updated
- Mar 22, 2026, 1:43 PM UTC
- Created
- Mar 20, 2026, 2:46 AM UTC