Andrei Alexandru

San Francisco Bay Area, USA

Bio

Updated 03/22/26

Andrei Alexandru is an AI safety researcher and machine learning engineer currently working at iGent AI on Maestro, an autonomous coding agent. He holds an MPhil in Machine Learning from the University of Cambridge, where his dissertation, funded by a Long-Term Future Fund grant, examined the inductive biases of shallow neural networks. He previously worked on the dangerous capability evaluations team at OpenAI, and at Atla, where he was first author on the Atla Selene Mini paper introducing a state-of-the-art small language model-as-a-judge. Earlier in his career he participated in the SERI MATS program under Evan Hubinger's mentorship and in the ML for Alignment Bootcamp at Redwood Research. His research interests include mechanistic interpretability, AI evaluations, and understanding deceptive alignment. He maintains the blog inwaves.io, where he writes about AI safety topics.