Thoughts on setting policy for new AI capabilities
Importance: 4 | # | ai, chatgpt, openai
// I lead model behavior at OpenAI.
tl;dr we’re shifting from blanket refusals in sensitive areas to a more precise approach focused on preventing real-world harm. The goal is to embrace humility: recognizing how much we don't know, and positioning ourselves to adapt as we learn.
Our perspective on launching (what feels like) a new capability has evolved across multiple launches:
Trusting user creativity over our own assumptions...
Seeing risks clearly, but not losing sight of everyday value to users. It’s easy to fixate on potential harms, and broad restrictions always feel safest (and easiest!)...
Valuing unknown, unimaginable possibilities. Maybe because of our cognitive bias toward loss aversion, we rarely consider the negative impacts of inaction; some people call these "invisible graveyards," though that's a bit morbid and extreme. There are also second-order or indirect impacts a new capability unlocks: all the positive interactions, innovations, and ideas from people that never materialize simply because we feared the worst-case scenario.
Always good to hear directly from the actual people doing frontier work.