Niels uit de Bos

Bio

Updated 03/22/26

Niels uit de Bos is a mathematician and software engineer working in AI safety and mechanistic interpretability. He holds a PhD in mathematics (algebraic geometry and the geometric Langlands program) from the University of Duisburg-Essen, supervised by Jochen Heinloth, and a Master's degree from Leiden University. He completed a research internship in mechanistic interpretability at MATS 5.0 and 5.1 in 2024, mentored by Adrià Garriga-Alonso at FAR.AI (Foundational AI Research). His primary research output from this period is "Adversarial Circuit Evaluation," a workshop paper at the ICML 2024 Mechanistic Interpretability workshop, which evaluated circuits from the interpretability literature adversarially and found that circuits for the IOI and docstring tasks fail to behave similarly to the full model even on benign inputs. He also co-authored "Relating Piecewise Linear Kolmogorov-Arnold Networks to ReLU Networks," published at AISTATS 2025. Prior to AI research, he had over five years of professional software engineering experience at companies including Zopa, G-Research, and Depict (a Y Combinator ML startup).

Grants

Updated 03/22/26

LTFF 2024 Q1 - Niels uit de Bos

from Long-Term Future Fund (funds.effectivealtruism.org)

Recipient: $62,000
