On Jagged AGI: o3, Gemini 2.5, and everything after
Importance: 4 | ethan-mollick, agi
Amid today’s AI boom, it’s disconcerting that we still don’t know how to measure how smart, creative, or empathetic these systems are. Our tests for these traits, never great in the first place, were made for humans, not AI. Plus, our recent paper testing prompting techniques finds that AI test scores can change dramatically based simply on how questions are phrased. Even famous challenges like the Turing Test, where humans try to differentiate between an AI and another person in a text conversation, were designed as thought experiments at a time when such tasks seemed impossible. But now that a new paper shows that AI passes the Turing Test, we need to admit that we really don’t know what that actually means.
What's clear is that we continue to be in uncharted territory. The latest models represent something qualitatively different from what came before, whether or not we call it AGI. Their agentic properties, combined with their jagged capabilities, create a genuinely novel situation with few clear analogues. It may be that history continues to be the best guide, and that figuring out how to successfully apply AI in a way that shows up in the economic statistics may be a process measured in decades. Or it might be that we are on the edge of some sort of faster take-off, where AI-driven change sweeps our world suddenly. Either way, those who learn to navigate this jagged landscape now will be best positioned for what comes next… whatever that is.