What is happening by pure chance?

“Moreover, we should expect, by chance alone, about 1 in 10 of the CEOs to have five winning or losing years in a row.” from “The Drunkard’s Walk: How Randomness Rules Our Lives” by Leonard Mlodinow
http://a.co/7h4baBy

Let’s say you want to hire the best CEO for your company, so you start following the yearly results of the Fortune 500 companies. After 5 years you shortlist 50 CEOs, from the 50 companies that have had 5 successful years in a row! Impressive, right?
So your man is one of those clever, successful CEOs and your company is off to a brilliant start… Or is it?
Before you hand over a multimillion contract to someone and start dreaming of your future house in the Bahamas, allow me to run some numbers… It will be quick, promise!
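As a rough sketch of the kind of numbers involved, the snippet below works out how many perfect five-year records pure luck would produce. The 50/50 chance of a winning year, the independence between years and the constant names are assumptions made for illustration; they are not taken from the book or the post.

```python
from fractions import Fraction

# Assumption (illustrative only): every CEO has a 50/50 chance of a winning
# year, independently of skill and of what happened in previous years.
N_CEOS = 500
YEARS = 5
P_WIN = Fraction(1, 2)

# Probability that one CEO wins all 5 years purely by luck.
p_streak = P_WIN ** YEARS

# Expected number of such lucky streaks among all the CEOs being followed.
expected_lucky = N_CEOS * p_streak

print(f"P(5 winning years by luck) = {p_streak}")
print(f"Expected lucky CEOs out of {N_CEOS}: {float(expected_lucky):.1f}")
```

Under these assumptions, roughly 16 of the 500 CEOs would show a flawless five-year record with no skill involved at all.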

How to control the unknown?

You are doing some statistical analysis (I know we all do at some point, don’t we?). You want to have a fair sampling, so there is no bias in your results.

You start with all the known factors that could have an impact on your study, and try to split them as fairly as you can between the different test groups, so each of them has a “similar mix” (whatever that really means). You spend several hours doing so (you are a really, really fair guy after all) and finally you find an arrangement that looks pretty good.

Then that weird feeling starts growing inside you, a cold sweat appears and your mouth suddenly goes dry…

What about the factors you don’t know about? 

Are they fairly split through your samples? 

Would they influence your results? 

How can you be sure?

The answer to this seemingly impossible conundrum is easier than expected! RANDOMIZE!!!!!


Key insight: How do you ensure your sample is fairly split amongst the different factors that can influence your test? Randomize!!!


Randomizing your samples will almost ensure that the mix obtained in each set is similar. That lets you control the factors and share their impact evenly across your test results, and that’s true even for the unknown ones!
So we can’t know it all, but if we join forces with randomness we can control almost all of it.


Final Thought: Random means chance. Randomness won’t ensure your samples get correctly mixed every time, but statistically the odds are on your side.
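To see this at work, here is a minimal sketch that gives each subject a hidden yes/no factor, splits the subjects at random into two groups and compares how often the factor shows up in each. The sample size, the 30% rate, the seed and all names are made up for illustration.

```python
import random

# Assumptions (illustrative, not from the original post): 10,000 subjects,
# a hidden yes/no factor present in about 30% of them, and two equal groups.
rng = random.Random(42)  # fixed seed so the run is repeatable
N_SUBJECTS = 10_000
HIDDEN_RATE = 0.30

# Each subject carries a factor we never get to observe during the study.
subjects = [rng.random() < HIDDEN_RATE for _ in range(N_SUBJECTS)]

# Randomize: shuffle the subjects, then cut the list in half.
rng.shuffle(subjects)
half = N_SUBJECTS // 2
group_a, group_b = subjects[:half], subjects[half:]

share_a = sum(group_a) / len(group_a)
share_b = sum(group_b) / len(group_b)

print(f"Hidden factor in group A: {share_a:.1%}")
print(f"Hidden factor in group B: {share_b:.1%}")
print(f"Gap between the groups:   {abs(share_a - share_b):.1%}")
```

With much smaller groups the gap tends to be wider, which is the “not every time” caveat: randomization balances unknown factors on average, not in every single draw.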


“The bigger problem is that factors we can’t anticipate or control may matter, and we’d like them scattered evenly between the two treatment groups. If we knew what the factors were, we could assure that they’re evenly split between the groups. The hope is that randomization will do that for us with things we’re unaware of. ”
https://www.johndcook.com/blog/2016/02/01/reproducible-randomized-controlled-trials/

Rare events and tests

You go to the doctor for a quick check-up. The doctor tells you to take a test for a very rare illness (1 in a million people have it, based on experimental and historical data) and hands you a test in a box. The box says that the test is 99.99% accurate.

You take the test and it is positive. So you definitely have that rare condition, right?

What is the actual probability you do?

The WHY

The problem above is one of the most counterintuitive I have heard of, and at the same time it rests on one of the most fundamental and powerful results, one you may run into constantly in your day-to-day life: Bayes’ Theorem.

The (surprising) answer to the question above is that in that situation you have about a 1% probability of having the disease. Yes, 1% (it is not a typo). How can that be?

The reasoning behind it is not difficult. A test with 99.99% accuracy will (on average) give you 9,999 correct results out of every 10,000 tests. For people who do not have the disease, that means 1 false positive in every 10,000 tests, and this is the key to understanding Bayes. If the rare condition you are testing for affects 1 in 1,000,000 (as in the situation above), what would be the expected results if you tested a million individuals?

The test is expected to give 1 false positive in every 10,000, so you end up with ~100 false positives out of 1,000,000 tests! That makes 101 positives in total (adding the expected 1 person who actually has the disease).

That is: 100 false positives and 1 real positive. So once you know you are one of those 101 positives, your chance of being the real positive is 1 in 101, or ~1%.
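The same count can be written as a direct application of Bayes’ Theorem. This is a minimal sketch; it assumes the accuracy on the box applies equally to sick and healthy people, and the variable names are chosen here for illustration.

```python
# Numbers from the story above.
prevalence = 1 / 1_000_000   # 1 in a million people have the illness
accuracy = 0.9999            # "99.99% accurate", as printed on the box

# Assumption: the same accuracy applies to sick and healthy people alike,
# i.e. the test is right 99.99% of the time in both cases.
p_pos_given_sick = accuracy
p_pos_given_healthy = 1 - accuracy

# Bayes' theorem: P(sick | positive) = P(positive | sick) * P(sick) / P(positive)
p_positive = (p_pos_given_sick * prevalence
              + p_pos_given_healthy * (1 - prevalence))
p_sick_given_pos = p_pos_given_sick * prevalence / p_positive

print(f"P(sick | positive test) = {p_sick_given_pos:.2%}")
```

The result lands at roughly 1%, in line with the 1 in 101 count.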

Unbelievable, right? But how? This is a much more interesting question…


The importance of understanding what’s being asked

“My dad heard this story on the radio. At Duke University, two students had received A’s in chemistry all semester. But on the night before the final exam, they were partying in another state and didn’t get back to Duke until it was over. Their excuse to the professor was that they had a flat tire, and they asked if they could take a make-up test. The professor agreed, wrote out a test, and sent the two to separate rooms to take it. The first question (on one side of the paper) was worth five points. Then they flipped the paper over and found the second question, worth 95 points: “which tire was it?”
What was the probability that both students would say the same thing? My dad and I think it’s 1 in 16. Is that right?
No, it is not: If the students were lying, the correct probability of their choosing the same answer is 1 in 4”

from “The Drunkard’s Walk: How Randomness Rules Our Lives” by Leonard Mlodinow
http://a.co/dCzpBt4


The WHY

What seems counterintuitive or paradoxical here comes, as is usual in probability riddles, from confusing the answer to a more “natural” or “common” question with the answer to the question actually being asked.

In this piece we are actually asking “what is the probability that 2 people independently make the same random choice out of 4 options?”, which is 1/4. That differs from the more usual question we may wrongly assume, “what is the probability that 2 guys both randomly choose the correct option out of 4 possible choices?”, which comes to 1/16 (remember that in the story the boys are lying, so there is no actual “correct” answer; any would do).

When stated like this, it is easy to see where the key difference lies (and why the answer 1/4 is not paradoxical at all). In the first question all 4 choices can lead to a successful outcome, while in the second only 1 choice does. So in the former it does not matter which choice is made (it only matters that the choices are the same), while in the latter it does (they need to be the same and, on top of that, the correct one).
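Both questions can be checked by listing all 16 equally likely pairs of answers and counting. This is a small sketch that assumes each student picks uniformly at random and independently; the tire labels and the “correct” tire are arbitrary placeholders.

```python
from fractions import Fraction
from itertools import product

# The four tire names are just labels for the 4 options.
TIRES = ["front left", "front right", "rear left", "rear right"]

# Assumption: each student picks uniformly at random, independently.
outcomes = list(product(TIRES, repeat=2))  # 16 equally likely pairs

# Question actually asked: do both students name the same tire?
same = [pair for pair in outcomes if pair[0] == pair[1]]

# Question we may wrongly assume: do both name one specific "correct" tire?
correct = "front left"  # arbitrary; any single tire gives the same count
both_correct = [pair for pair in outcomes if pair == (correct, correct)]

p_same = Fraction(len(same), len(outcomes))
p_both_correct = Fraction(len(both_correct), len(outcomes))

print(f"P(same answer)       = {p_same}")
print(f"P(both pick correct) = {p_both_correct}")
```

Four of the 16 pairs match, but only one of them matches on a tire fixed in advance, which is where 1/4 and 1/16 come from.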


THOUGHT: It is not paradoxical that 2 different questions have 2 different answers!


The Extra Mile

It is also interesting to think about how this looks from the teacher’s point of view. He (unlike us, the readers) does not know whether the kids are lying or not (that is the actual thing he is trying to find out). So the relevant question for him could be: what is the probability that they are lying (or telling the truth) given their answers?
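One possible way to frame the teacher’s question is with Bayes’ Theorem. The sketch below is only an illustration: the prior suspicion levels are hypothetical, and it assumes that students telling the truth would always name the same tire.

```python
from fractions import Fraction

def p_lying_given_match(prior_lying: Fraction) -> Fraction:
    """Posterior probability the students lied, given matching answers."""
    # If they lied, they match only by luck: 1 in 4 (from the reasoning above).
    p_match_if_lying = Fraction(1, 4)
    # Assumption: truthful students always name the same tire.
    p_match_if_truthful = Fraction(1)
    p_match = (p_match_if_lying * prior_lying
               + p_match_if_truthful * (1 - prior_lying))
    return p_match_if_lying * prior_lying / p_match

# Hypothetical priors: how suspicious the teacher is before the make-up test.
for prior in (Fraction(1, 2), Fraction(9, 10)):
    print(f"prior {prior} -> P(lying | same answer) = {p_lying_given_match(prior)}")
```

Under the same assumption, two different answers would settle the matter, since truthful students never disagree.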

Monty Hall

The story as it goes is well known: You are on a TV game show, and the host (called Monty) tells you the rules…

“There are 3 doors, one holds a flashy car, the other two goats… You may choose wisely…”

Easy… You have a 1 in 3 chance of winning the prize. You look at the doors, all look the same, no noises, no clues… Monty is not helping you… So you choose… done! But this is not all… Cheeky old Monty has a final twist up his sleeve… He goes and opens one of the doors you didn’t choose, only to reveal a goat, and then asks:

“Last chance! Do you stick or swap doors?”

So what do you do?

Which option gives you a better chance of winning? Or does it not matter? Think about it, and write it down…
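Once you have written your answer down, you can check it against a simulation. This is a minimal sketch that assumes Monty knows where the car is and always opens a goat door among the two you did not pick; the function names, the number of rounds and the seed are arbitrary choices.

```python
import random

def play(switch: bool, rng: random.Random) -> bool:
    """Play one round; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumption: Monty knows where the car is and always opens a goat door
    # among the doors the contestant did not pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is neither the pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch: bool, rounds: int = 100_000, seed: int = 0) -> float:
    """Fraction of rounds won when always sticking or always swapping."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(rounds)) / rounds

print(f"Stick: {win_rate(switch=False):.3f}")
print(f"Swap:  {win_rate(switch=True):.3f}")
```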
