lennxa

Can AI Ask Good Questions?


Thomas Wolf argues the current paradigm of LLMs won't get us Einsteins:

Just consider the crazy paradigm shift of special relativity and the guts it took to formulate a first axiom like “let’s assume the speed of light is constant in all frames of reference” defying the common sense of these days (and even of today…)

...benchmarks test if AI models can find the right answers to a set of questions we already know the answer to.

However, real scientific breakthroughs will come not from answering known questions, but from asking challenging new questions and questioning common conceptions and previous ideas.

When I think of how an LLM might question the status quo, two things come to mind: hallucination and reliability. Hallucination looks superficially like questioning the status quo, but it isn't quite that: there's a distinction between not knowing the rules and deliberately breaking them. We could ask today's LLMs to generate, say, a million questions, but then the issue is signal-to-noise. Some argue you can use LLMs to filter out the fluff, but that LLM-filter would have to be super smart and have great taste.
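A minimal sketch of that generate-then-filter idea, with stand-in functions where the LLM calls would go (`generate_questions` and `filter_score` are hypothetical placeholders, not a real API). The point it illustrates: the filter's quality, not generation volume, determines the signal-to-noise ratio of what survives.

```python
import heapq

def generate_questions(n):
    """Stand-in for sampling n candidate questions from an LLM."""
    return [f"question-{i}" for i in range(n)]

def filter_score(question):
    """Stand-in for a judge model scoring novelty/plausibility.
    A weak scorer means weak selection, no matter how large n gets."""
    return hash(question) % 100 / 100  # placeholder: arbitrary score

def top_questions(n_candidates, k):
    """Generate many candidates, keep only the k highest-scoring."""
    candidates = generate_questions(n_candidates)
    return heapq.nlargest(k, candidates, key=filter_score)

print(len(top_questions(1_000_000, 10)))  # prints 10
```

However smart the generator is, everything downstream is bottlenecked by `filter_score` — which is exactly the "the filter will have to be super smart" worry.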

Closely related, Dwarkesh:

What should we make about the fact that these models require so much training and the entire corpus of internet data in order to be subhuman?

Whereas GPT-4, there's been estimates that it was like 10^25 Flops or something, you can take these numbers with a grain of salt, but there's reports that the human brain, from the time it is born to the time a human being is 20 years old, is on the order of 10^14 Flops to simulate all those interactions.

We don't have to go into the particulars on those numbers, but should we be worried about how sample inefficient these models seem to be?
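Taking the quoted figures at face value (both are rough estimates that the speakers themselves hedge), the gap is about eleven orders of magnitude:

```python
# Both numbers are the rough estimates quoted above, nothing more.
gpt4_training_flops = 1e25  # quoted estimate for GPT-4's training run
human_20yr_flops = 1e14     # quoted estimate for a brain from birth to age 20

ratio = gpt4_training_flops / human_20yr_flops
print(f"{ratio:.0e}")  # prints 1e+11
```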

Scott Alexander offers an explanation:

Scott:

Humans also aren't logically omniscient.

My favorite example of this is etymology. Did you know that "vacation" comes from literally vacating the cities? Or that a celebrity is a person who is celebrated? Or that "dream" and "trauma" come from the same root? These are all kind of obvious when you think about them, but I never noticed before reading etymology sites.

I think you don't make these connections until you have both concepts in attention at the same time, and the combinatorial explosion there means you've got to go at the same slow rate as all previous progress.

Dwarkesh:

.@slatestarcodex's answer.

I actually agree with this.

But if making new connections and discoveries is intrinsically an NP-hard problem, then it's also less likely that some super-intelligence will rapidly exhaust the tech tree and start making nanobots.

Scott:

I agree that superintelligence can't just brute force every possible combination of ideas and keep the good ones.

I don't think this was most people's conception of superintelligence in the first place, but maybe you're suggesting that if possibilities increase combinatorially there's a hard limit on how smart you can get how quickly?

I think this is probably the wrong way to think about it. Consider the analogy to AI chess. AI chess also faced a rapidly exploding combinatoric problem that no plausible amount of brute force search could ever solve. But we found some good heuristics, those heuristics got better with time, and then the brute force compute was useful for searching what was left over after the heuristics were done. This was still good enough to bring chess AIs from infrahuman to far superhuman in a decade or so. So although chess engine designers do have to think about the combinatoric aspect of their work, I think forecasters would in retrospect have done well to ignore combinatorics and just draw a line through past progress.

Feels like it always comes down to good enough heuristics + search. Is this intelligence? I should read up on how people think about intelligence.
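The pattern Scott describes can be sketched with the textbook version of chess-engine search: alpha-beta pruning, where a hand-written evaluation heuristic bounds the search and brute force handles what's left within a depth budget. This is a toy illustration (the game, move generator, and evaluation are all made up for the example), not engine code:

```python
import math

def evaluate(state):
    """Heuristic value of a leaf. In chess this would be material,
    king safety, etc.; here, just the sum of the toy moves so far."""
    return sum(state)

def moves(state):
    """Toy move generator: each ply appends +1 or -1 to the state."""
    return [state + (1,), state + (-1,)]

def alphabeta(state, depth, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Minimax with alpha-beta pruning: the heuristic bounds (alpha, beta)
    let us skip subtrees that provably can't change the result."""
    if depth == 0:
        return evaluate(state)
    if maximizing:
        best = -math.inf
        for child in moves(state):
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:  # prune: opponent will never allow this line
                break
        return best
    else:
        best = math.inf
        for child in moves(state):
            best = min(best, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, best)
            if alpha >= beta:
                break
        return best

print(alphabeta((), 4))  # prints 0: each +1 by max is answered by -1
```

The combinatorics never went away — the game tree is still exponential — but heuristics plus bounded search was enough to go from infrahuman to superhuman anyway, which is Scott's point about drawing a line through past progress.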

#ai #ai-scaling #links