Although probability is used regularly in everyday life and across many domains, I think some of its fundamental aspects can be confusing.
Take a common probability question:
There are two children and at least one of them is a boy. What is the probability that the children are two boys?
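We’ll assume children are born as boys or girls with equal probability. The answer is 1/3, or about 33%: of the four equally likely combinations (boy-boy, boy-girl, girl-boy, girl-girl), three include at least one boy, and only one of those three is two boys. But let’s make a small change to the question.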
There are two children and at least one of them is a boy born on a Tuesday. What is the probability that the children are two boys?
And we’ll assume children are born on each of the days of the week with equal probability. The answer here is 13/27, or about 48%.
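Both answers can be checked by brute force. The following is a minimal sketch, assuming (as above) that each sex and each day of the week is equally likely and that the two children are independent. The names `conditional_probability`, `two_boys`, and so on are made up for this illustration. It lists every equally likely two-child family, keeps only the families consistent with what we were told, and counts how many of those are two boys.

```python
from fractions import Fraction
from itertools import product

SEXES = ("boy", "girl")
DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# Every (sex, day) combination for one child; all 14 are assumed equally likely.
CHILD = list(product(SEXES, DAYS))
# Ordered pairs (older, younger): 14 * 14 = 196 equally likely families.
FAMILIES = list(product(CHILD, repeat=2))


def conditional_probability(event, condition):
    """P(event | condition), found by counting equally likely families."""
    kept = [f for f in FAMILIES if condition(f)]
    hits = [f for f in kept if event(f)]
    return Fraction(len(hits), len(kept))


def two_boys(family):
    return all(sex == "boy" for sex, _day in family)


def at_least_one_boy(family):
    return any(sex == "boy" for sex, _day in family)


def at_least_one_tuesday_boy(family):
    return any(child == ("boy", "Tue") for child in family)


p_boy = conditional_probability(two_boys, at_least_one_boy)
p_tuesday_boy = conditional_probability(two_boys, at_least_one_tuesday_boy)

print(p_boy)          # 1/3
print(p_tuesday_boy)  # 13/27
```

The only thing that differs between the two calculations is the condition, that is, which families we keep. In the first case 147 families survive and 49 of them are two boys; in the second, 27 survive and 13 of them are two boys.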
These results do seem paradoxical, and they raise some questions about the nature of probability.
Question 1: Why does learning new information change the probability associated with a situation?
Imagine that you were listening to a recording that was telling you these puzzles, and you got distracted near the end of the sentence, missing that the boy was “…born on a Tuesday”. It appears that your missing those few words changes the answer from 48% to 33%. Why?
Let’s use another situation as an analogy to give a new perspective on the question. You watch a video of someone rolling a ball across a table. Your learning new information about the experiment is not going to change anything about that system (e.g. the ball’s speed), since the event has already happened. It could be that learning more about the experiment (e.g. how hard the person rolled the ball) helps you personally make a better guess at what the speed is. But in reality, the speed is still some unchanging value, even if you don’t know it. So why is it different with probability?
The above scenarios deal with you (the observer) having different levels of information about systems. When we have perfect information about a system, we are able to predict everything that happens. Then the probability of any aspect of the system becomes 0% (if it’s incorrect) or 100% (if it’s correct).
Take the case of flipping a coin. After the coin flies in the air, the laws of physics act in predictable ways to guide the coin as it falls. If we knew all the information associated with the system, we could theoretically apply physics knowledge to determine how the coin would land. Even before it hit the ground, we could state that the probability of it landing Heads was 0% or 100%.
However, in practice, we don’t know how hard the person tossed the coin, how heavy the coin is, or which way the wind was blowing. (The question “What is the probability of a coin landing Heads?” does not give any of those pieces of information.) Due to our limited knowledge, we have to neglect all of that. We are left with the knowledge we do have: the coin has 2 sides that are relatively indistinguishable, so we treat them as equally likely and say the probability of the coin landing Heads is 1/2 = 50%.
(Note: If we followed a non-deterministic theory like quantum mechanics, we would have to consider that even while knowing all the information pertaining to a system state (e.g. coin mass, wind, etc.), we would not be able to perfectly predict the result due to some sort of randomness inherent in the universe. But for the purposes of this article, we’ll stick with deterministic interpretations (classical physics and Newtonian laws), meaning that events are caused solely by previous states of the world.)
As we can see, for both physical variables (e.g. speed) and probability, there is some objective truth “out there”. When a ball is rolled across a table, it rolls at some speed (e.g. 1 m/s). This is true regardless of whether you knew that — or instead thought (based on your information) that it rolled at a different speed (e.g. 2 m/s). When a coin is flipped, even before it lands, the probability that the result is Heads is determined at some value 0% or 100% (let’s say 100%). This is again true regardless of whether you knew that — or instead thought (based on your information) that the probability was 50%.
The important difference between probability and physical variables (like speed) is that probability is generally only a useful concept in systems where you lack information. Questions regarding physical variables will often give you all the information you need (perfect information) to solve the problem, and the expected answer is therefore the “actual” answer. However, questions regarding probability will generally give you limited information, meaning that there will be aspects of the scenario that you cannot predict, so the expected answer is the one based on your limited knowledge. If probability questions gave perfect information, they might look something like “There is one child, and he is a boy. What are the odds he is a boy?”
The concepts illustrated here are all valid and internally consistent. It’s just that it makes more sense to use the concept of “probability” when we are lacking knowledge about the situation — that way, it represents our degree of certainty that an event will happen. If we knew everything, we would already know what would happen and our probability would be 100% (or 0%) all the time.
That seems to make sense. The way we use the term “probability” inherently has a subjective component: it relies heavily on your perspective and the information that you hold. When you get more information pertaining to a situation, the probability associated with the situation changes. So the answer to “Why does learning new information change the probability associated with a situation?” is that this is how “probability” is used and defined: it is partially based on your perspective and what knowledge you have.
As an example, we can take the case of the coin flip again, and flip the coin without looking at the result. What is the probability that it is Heads? Well, it’s 50%, since all we know is that there are 2 equally likely possibilities. Now let’s look at the coin and see the result. Aha, it’s Heads. Now what is the probability that it is Heads? Well, it’s 100%.
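The same idea can be written as a tiny sketch in code, assuming the two sides are equally likely. The function `probability_of_heads` is a made-up helper for this illustration: it takes the set of outcomes that are still consistent with what we know and counts what fraction of them are Heads.

```python
from fractions import Fraction

OUTCOMES = ("Heads", "Tails")  # assumed equally likely


def probability_of_heads(known_outcomes):
    """Probability of Heads given the outcomes still consistent with what we know."""
    consistent = [o for o in OUTCOMES if o in known_outcomes]
    heads = [o for o in consistent if o == "Heads"]
    return Fraction(len(heads), len(consistent))


# Before looking, both outcomes are still possible.
before_looking = probability_of_heads({"Heads", "Tails"})
# After looking and seeing Heads, only one outcome remains possible.
after_seeing_heads = probability_of_heads({"Heads"})

print(before_looking)      # 1/2
print(after_seeing_heads)  # 1
```

The coin is the same in both calls. The only thing that changes is the set of outcomes we still consider possible, and the probability follows from that.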
However, in this example, the link between the new information (looking at the coin) and the change in probability (50% Heads -> 100% Heads) seems fairly obvious. In our original case, the link between one boy having been born on a Tuesday and there being 2 boys doesn’t seem as intuitive. This still leaves us to wonder:
Question 2: Why does learning about seemingly irrelevant information change the probability associated with a situation?
This will be addressed in a future article!