
A Priori Statistics To A Posteriori Probability

Definitions are important. Much confusion is avoided if we agree on the meaning of the words we use. I try to be more careful in my usage when writing than in casual conversation, but maintaining one hundred percent accuracy is more difficult than maintaining one hundred percent precision. (What would it mean to be 110% accurate?)

Even though I have ranted about the differences between “theory,” “hypothesis,” and “conjecture,” I have occasionally been guilty of equally bad infractions by failing to observe the difference between “statistics” and “probability.” In common speech, these two terms are often used interchangeably, but they refer to different things.

Statistics is essentially the study of things past. What was the ratio of boy births to girl births at a hospital last year? What fraction of the American population was computer literate in 2006? These things are measured and compiled. They are often used to predict future events by taking the measured ratios as the probabilities of future births or future computer literacy.

Probability is essentially a prediction of things to come. A fair die is thrown. The probability that any given number comes up is 1/6. This is derived from the nature of a cube and first principles.

Of course, we could check the accuracy of the probability calculation by throwing the die many times and tabulating the frequencies with which each number comes up. That is, we could take the statistics of the operation of throwing the die and compare them to the computed probability. This might lead us to re-assign probabilities if we find that one side comes up more often than expected.
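The comparison between tabulated frequencies and the computed probability of 1/6 can be tried without wearing out a real die. What follows is only a minimal sketch: the function name `throw_die`, the fixed seed, and the throw counts are my own assumptions for illustration, and a simulated die is fair by construction, so it shows how the statistics settle toward the probability rather than how a biased die would be caught.

```python
import random
from collections import Counter


def throw_die(n_throws, seed=0):
    """Simulate n_throws of a fair six-sided die; return the observed frequency of each face."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(n_throws))
    return {face: counts[face] / n_throws for face in range(1, 7)}


if __name__ == "__main__":
    expected = 1 / 6  # the probability derived from first principles
    for n in (60, 6_000, 60_000):
        freqs = throw_die(n)
        worst = max(abs(f - expected) for f in freqs.values())
        print(f"{n:>6} throws: largest deviation from 1/6 = {worst:.4f}")
```

With only a few dozen throws, the observed frequencies can stray noticeably from 1/6 even for a fair die, which is exactly why the amount of data matters before re-assigning probabilities.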

But how do we decide when we have gathered enough data to justify modifying our prediction of the probabilities? That is one of the essential problems of decision theory. How do we make the transition from a priori statistics to a posteriori probability?

This transition is particularly important in predicting the probability of the occurrence of costly rare events such as a hurricane the strength of Katrina or a devastating tornado. The techniques of profiling for suspected terrorists depend on similar considerations. As different as you might think that predicting a hurricane is from identifying a potential terrorist, the underlying mathematics is similar in that both are rare events with a large penalty for missing true events and a relatively small penalty for a false alarm.

But you see how the language can trap us. What do I mean by “rare event,” “large penalty,” and “small penalty”? Evaluating the value of large and small is critical in deciding what to do. Consider a wager proposed to you by a wealthy, but eccentric, individual. He offers to flip a fair coin. You bet a dollar. If it comes up heads, you win $1.10. If it comes up tails, you lose your dollar. That sounds like a no-brainer. The odds are in your favor. You take it. But now suppose he changes the rules and requires you to bet $1,000,000 of your own money (no sharing or syndicates allowed) with a payoff of $1,200,000 if you win. The odds are even better than the $1.00 no-brainer. Would you make the bet?
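The arithmetic behind the two coin-flip wagers is short enough to write out. This is a rough sketch, assuming that "win $1.10" and "payoff of $1,200,000" mean net winnings on top of getting your stake back; the helper name `expected_value` is mine, not a standard function.

```python
from fractions import Fraction


def expected_value(p_win, win_amount, stake):
    """Average gain per play: win win_amount with probability p_win, otherwise lose the stake."""
    return p_win * win_amount - (1 - p_win) * stake


half = Fraction(1, 2)  # a fair coin

# Assumption: the quoted payoffs are net winnings, and a loss costs exactly the stake.
small = expected_value(half, Fraction(110, 100), 1)
large = expected_value(half, 1_200_000, 1_000_000)

print(f"$1 bet:         expected gain ${float(small):.2f} per play")
print(f"$1,000,000 bet: expected gain ${float(large):,.0f} per play")
print(f"Chance of losing the stake in either bet: {float(1 - half):.0%}")
```

The expected gain grows with the stake, but the chance of losing the whole stake stays at one half. The expected value alone says nothing about whether you can afford to be on the losing side.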

As for rare events, suppose the bet is changed so that if he throws ten heads in a row, he pays you 1200 times your bet of $1.00. Again, that is a no-brainer, since you know that 2 to the tenth power is only 1024, so the fair payoff would be only $1024. If you lose, what’s a dollar? If you win, you win big, and the payoff more than compensates for the long odds.

Now he gets really crazy and offers to pay 12,000 times your initial bet under the same terms, but you must put up a bet of $1,000,000 of your own money. You might be able to raise the bet money if you sell everything you own, borrow everything you can from friends and relatives, and max out your credit cards. The payoff is extremely biased in your favor. Do you take the bet?
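The same bookkeeping applies to the ten-heads wagers. Again, this is only a sketch under stated assumptions: a win pays the quoted multiple of the stake as net winnings, a loss costs the stake, and the name `expected_gain` is hypothetical.

```python
from fractions import Fraction

p_win = Fraction(1, 2) ** 10  # ten heads in a row with a fair coin: 1/1024


def expected_gain(multiplier, stake):
    """Expected net gain, assuming a win pays multiplier * stake and a loss costs the stake."""
    return p_win * multiplier * stake - (1 - p_win) * stake


ev_small = expected_gain(1200, 1)             # the $1.00 bet at 1200 to 1
ev_large = expected_gain(12_000, 1_000_000)   # the $1,000,000 bet at 12,000 to 1

print(f"Chance of winning either bet: {float(p_win):.4%}")
print(f"Chance of losing the stake:   {float(1 - p_win):.4%}")
print(f"$1 bet at 1200x:              expected gain ${float(ev_small):.2f}")
print(f"$1,000,000 bet at 12,000x:    expected gain ${float(ev_large):,.0f}")
```

Both wagers have a positive expected gain, and the second one is enormous, yet in 1023 plays out of 1024 the stake is simply gone. That gap between a favorable average and a near-certain loss is what makes "large penalty" and "small penalty" depend on who is paying.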

In response to the interest my original tutorial generated, I have completely rewritten and expanded it. Check out the tutorial, available through Lockergnome. The new version is over 100 pages long, with chapters that alternate between discussions of the theoretical aspects and puzzles just for the fun of it. Puzzle lovers will be glad to know that I included an answers section that explains why each answer is correct and how it was obtained. Most of the material has appeared in these columns, but some is new. Most of the discussions are expanded compared to what they were in the original column format.
