Niels uit de Bos
Bio
Niels uit de Bos is a mathematician and software engineer working in AI safety and mechanistic interpretability. He holds a PhD in mathematics (algebraic geometry and the geometric Langlands program) from the University of Duisburg-Essen, supervised by Jochen Heinloth, and a Master's degree from Leiden University. He completed a research internship in mechanistic interpretability at MATS 5.0 and 5.1 in 2024, mentored by Adrià Garriga-Alonso at FAR.AI (Foundational AI Research). His primary research output from this period is "Adversarial Circuit Evaluation," a workshop paper at the ICML 2024 Mechanistic Interpretability workshop, which evaluated circuits from the interpretability literature adversarially and found that circuits for the IOI and docstring tasks fail to behave similarly to the full model even on benign inputs. He also co-authored "Relating Piecewise Linear Kolmogorov-Arnold Networks to ReLU Networks," published at AISTATS 2025. Prior to AI research, he had over five years of professional software engineering experience at companies including Zopa, G-Research, and Depict (a Y Combinator ML startup).
Links
- Personal Website: https://niels.uitdebos.com/
- Twitter / X
- LessWrong
Grants
from Long-Term Future Fund
Details
- Last Updated: Mar 22, 2026, 11:58 PM UTC
- Created: Mar 20, 2026, 2:56 AM UTC