Karpathy Software 3.0 Talk
Importance: 6 | # | karpathy, ai, ai-tools, mine
Karpathy's talk is on YouTube. He goes through many interesting points with his usual elegance. The thing that struck me the most is this slide:
The way we are getting a lot of the work done right now is by having the AIs generate and us verifying. Verification is the bottleneck.
GUIs help with verification - using our brain's GPU (visual processing) instead of just CPU (text processing)
— Karpathy (paraphrased)
And the key to getting things done faster and at a higher throughput is to maximize the pace at which we run this loop. That does not mean 1000-line diffs; those are not reliable enough yet. Usually we are looking for small edits to things we can fully grasp.
Karpathy puts it like this: 'keep the AI on a tight leash', or the verification will become that much harder. Of course, some day we won't really be able to keep the leash, and we will likely give it up willingly at first.
I myself only like making small code changes (I've always vastly preferred fewer lines of code). I do ask for massive changes as well, but first I like to either plan them myself or have an AI write the plan, which I then verify before anything else happens. It comes down to 9s of reliability: we aren't yet at the place where we can blindly accept even small code changes.
Throwing a hard problem at a frontier LLM today and having it generate 10^x solutions will lead to the LLM solving it some of the time. The genius is there. It's the verification we are bottlenecked on. For example, see this, where an LLM finds a zero-day vulnerability, sometimes at a rate on the order of 1 in 100.
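To put a rough number on that, here is a small sketch of how likely it is that at least one of n attempts succeeds. It assumes every attempt is independent and succeeds with the same probability p, which is my simplification and not something from the talk. The function names are only illustrative.

```python
import math

def p_at_least_one(p: float, n: int) -> float:
    """Chance that at least one of n independent attempts succeeds.

    Assumption: every attempt succeeds with the same probability p.
    """
    return 1.0 - (1.0 - p) ** n

def attempts_needed(p: float, target: float) -> int:
    """Smallest n for which p_at_least_one(p, n) reaches the target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))

if __name__ == "__main__":
    p = 0.01  # the "1 in 100" order of magnitude mentioned above
    for n in (1, 100, 500):
        print(n, round(p_at_least_one(p, n), 3))
    print("attempts for a 95% chance:", attempts_needed(p, 0.95))
```

At 1 in 100, a hundred attempts give roughly a 63% chance of at least one hit, and it takes about 300 attempts to reach 95%. None of that helps unless something can tell which attempt was the hit, which is the verification problem again.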
Verification is easier for LLMs. They mostly get it right, but "mostly" isn't good enough right now. Working on agentic systems with a massive focus on verification will pay dividends. And the key with verification is to keep the rate of false negatives at essentially 0 and get the false positive rate as low as possible.
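A quick sketch of why the false negative rate matters so much when correct solutions are rare. I am assuming here that a "positive" means the verifier flags a candidate as faulty, so a false negative is a faulty candidate that slips through and a false positive is a good candidate that gets rejected. The numbers and the function name are made up for illustration.

```python
def verifier_outcomes(n: int, p_correct: float, fn_rate: float, fp_rate: float):
    """Expected results of running a verifier over n candidate solutions.

    Assumption: 'positive' means the verifier flags a candidate as faulty.
      fn_rate: fraction of faulty candidates that wrongly pass
      fp_rate: fraction of correct candidates that are wrongly rejected
    Returns (accepted_good, accepted_bad, share of accepted that is good).
    """
    correct = n * p_correct
    faulty = n - correct
    accepted_good = correct * (1.0 - fp_rate)
    accepted_bad = faulty * fn_rate
    accepted = accepted_good + accepted_bad
    share_good = accepted_good / accepted if accepted else 0.0
    return accepted_good, accepted_bad, share_good

if __name__ == "__main__":
    # 10,000 candidates, 1 in 100 correct, verifier rejects 10% of good ones.
    for fn_rate in (0.01, 0.0001):
        good, bad, share = verifier_outcomes(10_000, 0.01, fn_rate, 0.10)
        print(f"fn_rate={fn_rate}: good={good:.1f} bad={bad:.2f} share_good={share:.3f}")
```

With a 1% false negative rate, fewer than half of the accepted candidates are actually good, because the faulty ones outnumber the good ones 99 to 1. Drop the false negative rate to 0.01% and nearly everything accepted is good, while the 10% false positive rate only costs a few wasted correct solutions.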