Since Daniel Kahneman’s Thinking, Fast and Slow, the idea that there are two competing systems in our brains has become very well known, and it is one that has been around for quite a while.
The common wisdom is that we are generally lazy and rely on System 1 by default - it’s quick and low effort, but has many flaws. When we are sufficiently focused and it’s needed, we “switch” over to System 2 - it’s slow and hard work, but the end result is better. These are sometimes described as two distinct systems, although the reality is that they work together.
In the context of artificial intelligence, we might consider LLMs to be primarily intuitive, System 1 thinkers. A common criticism of LLMs is that they lack (advanced) reasoning abilities, as if there are two distinct ways to think and we have only built one of them. But we quickly discovered that we can encourage more System 2-like thinking with prompting - “Let's walk through step by step”, for example. Since then, Tree of Thought prompting has taken this even further, evaluating multiple reasoning steps at each iteration with a prompted panel of “experts”.
Both of these approaches are based on the idea that we can reason our way forwards to an answer in a single pass. There is no doubt they are extremely effective, but what if that's not how humans reason?
I first read about this idea in The Mind is Flat by Nick Chater. Here's the basic idea:
Reasoning is backwards - from answer to reason, rather than from reason to answer. We make the majority of our decisions throughout the day intuitively. When someone asks us “why?”, “why did you do that?”, “why do it that way?” etc, we make up a reason on the spot. We are incredibly good at generating reasons for things - so good, in fact, that we don't even realize we are making them up!
This often gives us the illusion that we actually used reasoning to make all these decisions, even though (often) no such reason was involved in our decision at the time - it just felt right.
The view that we can engage our reasoning system to override our intuition and come to a logical, more correct answer is thrown up in the air. We are much more intuitive than we think, and it’s almost impossible not to use intuition, often to our detriment.
The argument is taken further in The Enigma of Reason by Hugo Mercier and Dan Sperber, which hammers home the point that reasoning is backwards, but also suggests that reasoning developed as a social faculty more than anything else - “Our hypothesis is that the function of reasoning is argumentative. It is to devise and evaluate arguments intended to persuade”. Presenting good reasons for our answers, actions and decisions to other humans is crucial for developing trustworthiness, and for showing we have integrity and are consistent. Someone who presents good reasons gains social status, and this is beneficial to our survival. The societal impact of this is explored by Lionel Page in Why Reason Fails.
Centuries of scientific progress would imply that at least some of us are able to actually apply rigorous logic to deduce answers and solve problems, and that reasoning can influence our actions and beliefs - so what's going on?
Given an intuition, we may - without being explicitly asked by others - ask ourselves what the reason is, whether out of habit or because we anticipate being asked to justify ourselves in the future. If our reason doesn't seem sound, we may search for other reasons until we find something that holds up. If we fail to find a good reason, or stumble across a reason against, then we may have to reconsider our initial, intuition-driven conclusion.
This process looks a bit like a heuristic-based search. Our intuition (the heuristic), given any decision, question etc, drives us toward certain parts of the solution space and proposes an answer. We then try to find good reasons for the answer - a sort of validation process to make sure that our answer is sound, or at least that we would not be embarrassed to state it, along with our reason for it, to others. Coming up with reasons may itself be a search problem, with its own validation step (is this a good reason? If not, look for another).
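As a rough sketch of that loop in code - the `propose`, `find_reason` and `is_sound` functions here are placeholders for whatever intuition, reason generation and validation look like in a particular domain, not a specific implementation:

```python
# A minimal sketch of "intuition proposes, reason validates":
# the heuristic suggests an answer, we look for a reason for it afterwards,
# and only answers we could defend survive.

def reason_backwards(question, propose, find_reason, is_sound, max_attempts=5):
    rejected = []
    for _ in range(max_attempts):
        answer = propose(question, avoid=rejected)   # intuition: the heuristic driving the search
        reason = find_reason(question, answer)       # the backwards step: justify the answer
        if is_sound(question, answer, reason):       # validation: would we state this to others?
            return answer, reason
        rejected.append(answer)                      # the intuition failed validation, search again
    return None, None                                # no defensible answer found, reconsider
```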
Here’s an example in a domain where we expect a more rigorous approach to decision making.
A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?
Your intuition will probably drive you to an initial answer. If you've not seen this before, or don't have much experience with algebra, you will probably have considered the ball costing 10c. If you've seen this or similar problems in the past, your intuition might be 5c.
In either case, the next thing to do is validate the answer (find a reason).
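Spelling that check out for the two intuitive answers is just plain arithmetic:

$$
\begin{aligned}
\text{ball} = 0.10 &\Rightarrow \text{bat} = 1.10,\ \text{total} = 1.20 \neq 1.10 && \text{(the reason fails, reconsider)}\\
\text{ball} = 0.05 &\Rightarrow \text{bat} = 1.05,\ \text{total} = 1.10 && \text{(the reason holds up)}
\end{aligned}
$$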
If you are mathematically inclined, you may have observed that this is basic algebra and solved the problem that way. I discuss this more below.
This is a very general pattern - searching with a heuristic and validating the results - that we have used to solve many hard problems in various domains.
Chess has mostly been approached as a search problem: we search the possible sequences of moves and then we evaluate them. As a very amateur chess player, this certainly feels like how I actually approach chess - my (fairly poor, but better than random) intuition glances at the board and suggests which piece I should move and maybe where I should move it. I then evaluate that position, in effect looking for a reason for it (moving my knight to XX is a good move as it is protected and it forks my opponent’s queen and rook). If there’s no good reason for the move, or a bad reason, I trigger my search again, looking at another possible move, driven by my intuition.
Expert chess players have very good intuitions about moves. The original chess computers didn’t have human-level intuition, but could evaluate millions of possible sequences of moves in a very short time - i.e. a much deeper search - and this was enough to give them an advantage.
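That style of engine is, very roughly, exhaustive lookahead plus a handcrafted evaluation. A stripped-down sketch - the `Position` interface and the `evaluate` function are assumptions for illustration, not any real engine's API:

```python
# Depth-limited negamax: consider every legal move, recurse, and score the leaves
# with a handcrafted evaluation function. No intuition, just depth.

def negamax(position, depth, evaluate):
    if depth == 0 or not position.legal_moves():
        return evaluate(position)            # the handcrafted "reason" to like a position
    best = float("-inf")
    for move in position.legal_moves():      # every legal move gets considered
        score = -negamax(position.play(move), depth - 1, evaluate)
        best = max(best, score)
    return best
```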
AlphaZero/Go gave this search algorithm something it didn’t have before - intuition. It didn’t waste time considering bad moves, as it had learnt a good intuition for which parts of the search space to explore first. It also learnt to have intuitions about the value of board positions (intuitions about reasons) - in this case, you could consider the “reason” to be the resulting position and its value.
The original AlphaZero paper evaluates the Elo of the “intuition” alone, without any lookahead (search), and it’s still pretty good at ~3000 Elo, compared to ~5000 with deep search. In other words, it’s still a GM using only intuition, or “System 1” thinking.
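A sketch of what that intuition does to the search above - a learnt policy proposes a handful of promising moves rather than all of them, and a learnt value function scores positions directly rather than searching to the end. This is a simplification for illustration (AlphaZero actually uses these two networks inside Monte Carlo tree search), and `policy`, `value` and the `Position` interface are assumed, not the real implementation:

```python
# Intuition-guided lookahead: only the moves the policy likes are explored,
# and leaf positions are scored by the value network instead of deeper search.

def guided_search(position, depth, policy, value, width=3):
    if depth == 0 or not position.legal_moves():
        return value(position)               # intuition about how good this position is
    # assumed: policy returns the legal moves ordered from most to least promising
    candidates = policy(position)[:width]    # intuition about which moves are worth a look
    return max(-guided_search(position.play(move), depth - 1, policy, value, width)
               for move in candidates)
```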
Not all problem solving is search, though. Sometimes we spot that there is a specific process for solving a problem - in this case, some fairly simple algebra to work out the correct answer. These processes have been developed over centuries; they are written down and passed from generation to generation. Looking back at the bat and ball example:
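Writing the ball’s price as $b$ (in dollars), the whole thing is a couple of mechanical steps:

$$
\begin{aligned}
b + (b + 1.00) &= 1.10\\
2b &= 0.10\\
b &= 0.05
\end{aligned}
$$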
While this particular example is a simple procedure, the process of transforming an equation into one that actually gives us an answer is itself a nested search problem (a path of transformations that gives us the left hand side expression that we want) - one that mathematicians have very good intuitions for.
Much of our day to day lives is governed by processes and habits that we have memorized. In fact, probably an even higher proportion of our day to day actions is governed by habit than by intuition, and an even smaller fraction uses the reasoning loop described above.
How might this way of thinking about reasoning impact how we approach reasoning in LLMs?
Rather than encouraging systems to find a (forward pass) reasoned argument toward an answer, we might want to employ reasoning as a backwards looking step and utilize it in a search process.
There is lots of exploration in this space. Tree of Thought prompting, as mentioned, builds upon Chain of Thought prompting and demonstrates a significant improvement in performance in some cases. While ToT isn’t really a full search, others have applied something more like a true search process, wrapping multiple LLM calls in a process in certain domains - such as code generation with AlphaCodium.
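As a rough sketch of what reasoning as a backwards looking, validating step might look like wrapped around an LLM - the `llm` function and the prompts are placeholders, not any particular framework's API:

```python
# Sample candidate answers (intuition), ask the model to justify or reject each one
# (the backwards-looking reason), and only accept an answer that survives its own critique.

def answer_with_validation(question, llm, num_candidates=3):
    for _ in range(num_candidates):
        answer = llm(f"Answer concisely: {question}")
        critique = llm(
            f"Question: {question}\nProposed answer: {answer}\n"
            "Give one strong reason this answer is correct, or reply 'FLAW: ...' if it is not."
        )
        if not critique.strip().upper().startswith("FLAW"):
            return answer, critique          # an answer we could defend to someone else
    return None, None                        # nothing survived validation
```

The point is that the reason is generated after the candidate answer and used to accept or reject it, rather than being the path by which the answer was derived.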
There’s nothing new here that we don’t have today. It’s believable that reasoning is built from intuitions, and intuitions about reasons - there are only intuitions, just about different things! We don’t necessarily need to build something fundamentally new or different; we might just need to wrap what we have in a process - in this case, a process based on search.