Every now and then the “Ask Marilyn” column by Marilyn vos Savant provides the basis of a challenging decision theory or statistics problem. Famously, she stirred up a flood of comments over the Monty Hall paradox.
Recently she discussed an issue that confuses many people, often including professionals. The issue starts out simply: you will toss a fair six-sided die 20 times and record the results. Before you start tossing, which is more likely to be the result: 11111111111111111111, or 66234441536125563152? Contrary to what you might naively think, the two results are equally likely. Each is just as likely to occur as any other pattern you can write down before the tossing starts. All twos, all threes; it does not matter. Every specific pattern has the same probability, 1/6**20.
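The arithmetic behind that claim is short enough to check directly. The following is a minimal Python sketch using only the standard library; the function name `sequence_probability` is made up for illustration and assumes a fair die with independent tosses.

```python
from fractions import Fraction

def sequence_probability(sequence: str, sides: int = 6) -> Fraction:
    """Probability of rolling exactly this sequence with a fair die."""
    # Tosses are independent, so each position contributes a factor of 1/sides.
    return Fraction(1, sides) ** len(sequence)

all_ones = "1" * 20
mixed = "66234441536125563152"

# Both sequences are 20 tosses long, so both have probability 1/6**20.
print(sequence_probability(all_ones) == sequence_probability(mixed))  # True
print(6 ** 20)  # 3656158440062976
```

Exact fractions are used here rather than floats so that the equality comparison is not affected by rounding.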
Now it gets interesting and stirs up some controversy. Suppose I have already thrown the die and tell you that the result was one of the two given above. Which one is more likely to be the correct result? Having thought through the initial problem and agreed that every pattern is equally probable, one might assume that the same answer applies regardless of whether anyone has seen the results, and as far as the tosses themselves go, that would be true. The probability of a given pattern does not change before or after the event, but that is the source of the misunderstanding. What one does with the results changes the answer.
The probability that a given set of numbers represents the results of tossing the die changes dramatically because an observer recorded them and, out of all the possible configurations (six raised to the 20th power, about 3.66 quadrillion), gave you only two to choose from. This subtle shift has been slipped in and changes the problem completely. That is, the possible results have been filtered by the observer.
You could approach the problem by computing the probability that 20 tosses will produce exactly 20 ones (1/6**20), and comparing it to the probability that 20 tosses produce a pattern essentially like the other one. However, I prefer to think in terms of pattern and symmetry. Since the die is fair, we expect the output to be random and without a pattern or symmetry. That will be the case with the overwhelming majority of possible results. Some results might have patterns such as 12345612345612345612, but most will not have a readily perceivable pattern. That observation itself is enough to warn us that all ones is unlikely to be the result compared to the only alternative given as a possible choice, which is different from being compared to all possible configurations.
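One way to make "a pattern essentially like the other one" concrete is to count every sequence that contains the same number of ones, twos, and so on as the given result, in any order. That reading is an assumption chosen for illustration, not the only possible one, and the function name `arrangements` is hypothetical. A minimal Python sketch under that assumption:

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def arrangements(sequence: str) -> int:
    """Number of distinct orderings of the digits in `sequence`.

    This is the multinomial coefficient n! / (k1! * k2! * ...), where
    each k is how many times one face appears in the sequence.
    """
    count = factorial(len(sequence))
    for repeats in Counter(sequence).values():
        count //= factorial(repeats)  # exact at every step
    return count

total = 6 ** 20  # number of possible 20-toss sequences
for seq in ("1" * 20, "66234441536125563152"):
    n = arrangements(seq)
    # Chance that 20 tosses land on *some* ordering of these digits.
    print(seq, n, float(Fraction(n, total)))
```

Under this assumption there is exactly one way to roll twenty ones, but about 3.26 trillion ways to roll a sequence with the same mix of faces as the second result.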
That is, the conditions of the puzzle have eliminated 6**20 - 2 of the possible choices and ask us to choose between the two that remain. It is roughly like first asking what the probability is that a random person anywhere in the world is your close relative, and then asking which is the closer relative: your aunt or another person chosen at random. In both cases the most likely answer should be obvious. There is a small but nonzero chance that the other person is your mother and therefore a closer relative, but would you bet on it? Similarly, would you bet that all ones was the result of tossing the die?
Now when all is said and done, suppose you watched a person throwing a die 19 times and it came up one every time. What would you bet the next toss will be?