Addressing Immediate AI Safety Concerns through DevInterp
Seeding a business which finds grants and High Net Worth Individuals beyond EA
by buying gift cards for the game and handing them out at the OpenAI offices
No summary available yet.
Benchmark for agent safety when spending users' money. How often do they violate user intent and rules?
Surveying neuroscience for tools to analyze and understand neural networks and building a natural science of deep learning
The first AI safety evaluation benchmark for Nigerian indigenous livestock systems, testing whether frontier models are safe to deploy in African food systems.
This round of funding will be used primarily for prototype hardening, artifact packaging, runtime evaluation, and preparation for external review.
Advocating for U.S. federal AI safety legislation to reduce catastrophic AI risk.
I've self-funded my ramp-up for six months, and interview and grant processes are taking longer than expected.
Promoting better management of Global Catastrophic Risks in Spanish-Speaking countries.
Organizing a global AI ethics think tank to provide dynamic AI research updates and a framework for implementing AI safety policies and human income support
Running the initial online version of a 4-week biosecurity course for 20-50 participants
LLMs often know when they are being evaluated. We’ll do a study comparing various methods to measure and monitor this capability.
Evitable is a nonprofit that informs and organizes the public to confront societal-scale risks from AI and put an end to the reckless race to develop superintelligence.
Social media content across YouTube, Instagram, and TikTok to grow AI x-risk awareness and build political momentum for a global pause.
Inspiring India’s Middle‑Schoolers to pursue AI Safety, Governance, and X‑Risk Work
Developing an innovative wisdom layer for AI that enhances its capabilities for deep analysis, safe AI, and creative solutions to complex systemic problems.
De-risking AI Catastrophe: A cyber-physical protocol using ZKPs and NIR Spectroscopy to resolve the governance deadlock in critical global infrastructure.
Developing correct-by-construction world models for verification of frontier AI
CaML researches how synthetic pretraining data can shift AI systems towards greater compassion and moral open-mindedness regarding all sentient beings, including animals and potential digital minds.
We build a scalable "Automated Circuit Discovery" method and investigate "Cleanup Behavior" to advance the interpretability of transformer models.
Measured post-embodied sensation integration. Solo daily-pace brain-function development in Osaka. Phase 1 funds higher cognitive integration program.
Triadic geometric training data and architecture to replace RLHF
Six-month support for a Program Manager to organize and execute international AI safety hackathons with Apart Research
Do ACE-style cost-effectiveness analysis of technical AI safety orgs.
Sage builds tools to improve forecasting skills and public understanding of AI capabilities, with the goal of reducing global catastrophic risks from emerging technologies.
AI Futures Project
Independent collective. Φ-Arena open benchmark, 3 ICLR 2027 papers (Φ-Arena, mechinterp, energy-bounded) — kickstart for a 10-year program.
Non-profit facilitating progress in AI safety R&D through events
Short Documentary and Music Video
3-month salary for AI safety work on deconfusion and technical alignment.
A Coherence-Based Emergent Protocol
Doom Debates is a podcast and debate show hosted by Liron Shapira focused on high-stakes debates about AI existential risk. Its mission is to raise mainstream awareness of potential extinction from AGI and build social infrastructure for high-quality public discourse on the topic.
Practicing Embodied Protocols that work with Live Interfaces
1-year salary for independent research to investigate how LLMs know what they know.
Enabling Compassion in Machine Learning (CaML) to develop methods and data to shift future AI values
A virtual pet simulator that teaches reinforcement learning failures through simple and fun interactions.
A Baltimore-based nonprofit media platform that produces podcasts, videos, and social content to bring AI extinction risk into mainstream public conversation.
Ship It: Building Bridges for Better AI Outcomes
An AI safety research initiative developing new adaptive theoretical frameworks and AI interface designs to keep human sensemaking at pace with rapidly advancing AI systems.
An advanced agent that perceives your screen and executes tasks by controlling the mouse, acting as a digital proxy to handle complex work on your behalf.