6 months' salary to turn intuitions, such as goals, wanting, and abilities, into concepts applicable to computational systems
Database
Project Test Application
A model-agnostic benchmark for detecting deceptive reasoning in LLMs through behavioral fingerprints — no weight access required.
Starting funds and moving costs for a DPhil project in AI that addresses safety concerns in ML algorithms and positions
Identifying and auditing reasoning circuits in LLMs within Algoverse 2026 using Sparse Autoencoders (SAEs).
This grant will support Naoya Okamoto in upskilling in AI safety research. Naoya will take the Mathematics of Machine Learning course offered by the University of Illinois at Urbana-Champaign.
Top-up funding for a 3-month new hire trial to help me connect, expand and enable the AGI gov/safety community in Canada
Additional funding for AI strategy PhD at Oxford / FHI
Funding for having written AI safety distillation posts on the topic of membranes/boundaries
Funding towards a 2-year postdoctoral stint to work on safety in AI, with a focus on developing value-aligned systems
Support for Marius Hobbhahn for piloting a program that approaches and nudges promising people to get into AI safety faster
Early exploration, agenda-setting, technical infrastructure, and early community building
6-month salary to transition to a career in AI safety while working on AI safety projects
4-month salary for independent research and AIS field building in South Africa, including working with the AI Safety Hub to coordinate reading groups, research projects, and hackathons that empower people to start working on research.
Support to conduct work in AI safety
Four months of funding for a MATS 5.0 extension, working on improving methods in latent adversarial training
3-month stipend to do independent AI governance research
Building bridges between Western and Chinese AI governance efforts to address global AI safety challenges.
181,448 evaluations proving no production AI model reliably maintains corrections. Expanding coverage and pursuing multi-pass validation.
Travel support to attend the Symposium on AGI Safety in Oxford in May
Creating a contest for Robust, Detailed Proposals and Redteaming of AI Safety Plans: Fast Action for Safe Transformative AI
3-month stipend to support research on the state of AI safety in China and implications for AI existential risk
I plan to investigate what realistic RL training conditions might lead to LLMs developing steganographic capabilities.
$10,120 for most of the ops expenses of the research phase (June-Sep 2024) of WhiteBox’s AI Interpretability Fellowship
Compute costs for experiments to evaluate different scalable oversight protocols
4-month stipend to continue work on AI Control as a MATS extension
Stipend and expenses to run the second Athena mentorship program for gender-minority researchers in technical AI alignment
6-month salary for me to continue the SERI MATS project on expanding the "Discovering Latent Knowledge" paper
6-month salary to accelerate my plans to upskill in order to work on AI safety
The self-study section of AISafety.com curates courses, textbooks, and reading lists for independent learning in AI safety, covering both technical alignment and AI governance.
Year-long stipend to work as the primary maintainer of TransformerLens, and implement large changes to the code base.
Funding For Humanity: An AI Risk Podcast
6-month salary to dedicate full-time to upskilling/AI alignment research tentatively focused on agent foundations
Request for Retroactive Funding
6-month stipend for evaluating the robustness of AI agents' safety guardrails and for running an AI spear-phishing study
Funding to cover 4 months of rent while attending a research group with the Cambridge AI Safety group
7-month stipend for organising AI Alignment Irvine (AIAI)
Does harmful fine-tuning data cause broad misalignment only when the model already recognises the target behaviour as a norm violation?
Producing video content on AI alignment
4-month stipend to study refusals and jailbreaks in chat LLMs under Neel Nanda as part of the MATS 5.0 extension program
6-month salary for Fabian Schimpf to upskill into AI alignment research and conduct independent research on limits of predictability