You go to the doctor for a quick check-up. The doctor tells you to take a test for a very rare illness (1 in a million people have it, based on experimental and historical data) and hands you a test in a box. The box says the test is 99.99% accurate.
You take the test and it is positive. So you definitely have that rare condition, right?
What is the actual probability you do?
The problem above is one of the most counterintuitive I have heard, and at the same time it is one of the most fundamental and powerful results you will keep running into in your day-to-day life. It’s Bayes’ theorem.
The (surprising) answer to the question above is that in that situation you have around a 1% probability of having the disease. Yes, 1% (it is not a typo). How can that be?
The actual reasoning behind it is not difficult. A test with 99.99% accuracy will (on average) give you 9,999 correct results out of every 10,000 tests, which for healthy people means 1 false positive in every 10,000 tests, and this is the key to understanding Bayes. If the rare condition you are testing for affects 1 in 1,000,000 people (as in the situation above), what would be the expected results if you tested a million individuals?
The test is expected to give 1 false positive every 10,000 tests, so you end up with ~100 false positives out of 1,000,000 tests! That makes 101 positives in total (adding the expected 1 person who actually has the disease).
That is: 100 false positives and 1 real positive. So once you know you are one of those 101 positives, your chance of being the real positive is 1 in 101, or ~1%.
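If you prefer to see the head count done by a machine, here is a minimal Python sketch of the same reasoning. The function name `positives_breakdown` is made up for illustration, and it assumes that "99.99% accuracy" applies equally to sick and healthy people, which the box does not actually spell out.

```python
def positives_breakdown(population, prevalence, accuracy):
    """Expected true and false positives if everyone in `population` is tested.

    Assumption: `accuracy` is both the chance a sick person tests positive
    and the chance a healthy person tests negative.
    """
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * accuracy            # sick people correctly flagged
    false_positives = healthy * (1 - accuracy)  # healthy people wrongly flagged
    return true_positives, false_positives


tp, fp = positives_breakdown(1_000_000, 1e-6, 0.9999)
chance = tp / (tp + fp)  # share of all positives that are real
print(f"expected true positives:  {tp:.2f}")
print(f"expected false positives: {fp:.2f}")
print(f"chance of being sick given a positive: {chance:.2%}")
```

The false positives come out at roughly 100 against roughly 1 true positive, so the chance lands at about 1%, matching the 1 in 101 above.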
Unbelievable, right? But how? This is a much more interesting question…
The reason underlying that seemingly nonsensical result is the rarity of the event we are testing for. More precisely, it is the comparison between the rarity of the event and the precision of our test. Even though 99.99% precision (1 error in 10,000, order 4) may seem pretty good for our day-to-day operations, the disease is rarer still (1:1,000,000, order 6), far too rare for our test’s precision.
Making sense of the chances:
- 20 consecutive coin tosses all coming up tails ~ 1:1,000,000 chance… It is almost impossible (see how big a million is)
- 14 coin tosses all coming up tails ~ 1:16,000 chance… Still very rare!
- 10 coin tosses all coming up tails ~ 1:1,000 chance… I have actually seen this done in this video from SingingBanana
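The coin-toss odds above are just powers of two, which a few lines of Python can confirm. This is only a quick check assuming a fair coin; the helper name `odds_all_tails` is invented for this example.

```python
def odds_all_tails(tosses):
    """Return N such that `tosses` tails in a row is a 1-in-N event (fair coin)."""
    return 2 ** tosses  # every extra toss doubles the number of equally likely sequences


for tosses in (20, 14, 10):
    print(f"{tosses} tails in a row: 1 in {odds_all_tails(tosses):,}")
```

The exact values are 1,048,576, then 16,384, then 1,024, so the rounded figures in the list are close enough.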
The Extra Mile
1:1,000,000 is indeed a very, very rare event (it was only used here to illustrate the point), but this effect happens even with much less rare events.
What about a 1:10,000 event (still very rare) and a 99.99% test? It turns out a positive test result means only a 50/50 chance of actually having the disease, as the expected result is that in 10,000 tests there will be 1 false positive and 1 real positive.
NOTICE: In Europe, a rare disease is one that affects fewer than 1 in 2,000 people.
Source: What is a rare disease
And a disease affecting 1:1,000? Here we seem to get closer to our test’s efficiency, with a 90.9% (10/11) chance of having the disease if we test positive… But please notice that a positive result is still wrong about 9% of the time, roughly 900 times the 0.01% error rate the test “promised”!
A 1:100 disease tested at 99.99% gets up to a 99% success rate, and a 1:10 disease gets 99.9% (still 10 times worse than advertised)… And all this with an almost impossibly good test of 99.99% accuracy! (A lot of tests don’t even get to 99%…)
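All the cases above come from the same formula, Bayes' theorem, so they can be tabulated in one go. The sketch below uses exact fractions to avoid rounding surprises. The function name `posterior` is my own, and once more it assumes the 99.99% figure holds for both sick and healthy people.

```python
from fractions import Fraction


def posterior(prior, accuracy):
    """P(sick | one positive test) by Bayes' theorem.

    Assumption: `accuracy` is both the true-positive rate and the
    true-negative rate of the test.
    """
    true_pos = prior * accuracy               # sick and flagged
    false_pos = (1 - prior) * (1 - accuracy)  # healthy but flagged
    return true_pos / (true_pos + false_pos)


accuracy = Fraction(9999, 10000)
for one_in in (1_000_000, 10_000, 1_000, 100, 10):
    p = posterior(Fraction(1, one_in), accuracy)
    print(f"disease in 1 of {one_in:>9,}: positive is real {float(p):.2%} of the time")
```

The 1:10,000 case comes out at exactly 1/2, and the others land close to the rounded figures quoted above (the small differences come from counting 999 healthy people instead of 1,000, and so on).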
So are we doomed? Can’t we trust our tests? What’s the point of testing?
Well, the solution is easy. In fact, very easy: repeat the test! (Independently)
If we do a second test on our original problem and we get another positive result, what are the chances then?
Here we are asking how likely it is that I have a 1:1,000,000 disease if I have 2 independent 99.99% tests both giving me a positive result (remember that with only 1 test my actual chances were ~1%)… And the answer to this question turns out to be around 99%!
One quick way of seeing why the probability increases so much (from 1% to 99%) is to realise what question you are really asking. Take those 101 positives from the first test plus another 9,899 other people, and put them all through another round of 10,000 tests. How likely is it that the one false positive expected from that second round falls on one of the original 100 false positives? Not likely at all! In fact, it is easy to see it is 1% (think of a lottery with numbers from 1 to 10,000 in which you buy 100 tickets). So if you actually have 2 positives, the chance that both are false positives is about 1%… So you are probably (99%) not a false positive…
Final thought: real test accuracy does not depend only on the design and make of the test, but also on the event it is testing for.
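The same jump from 1% to 99% shows up if you simply apply Bayes' theorem twice, feeding the result of the first test in as the starting belief for the second. Here is a small sketch, assuming (as stressed above) that the two tests are independent and that the 99.99% accuracy applies to sick and healthy people alike. The name `update` is just for illustration.

```python
def update(belief, accuracy):
    """Return P(sick) after one more positive result, starting from `belief`.

    Assumptions: each test is independent of the others, and `accuracy`
    is both the true-positive and the true-negative rate.
    """
    true_pos = belief * accuracy
    false_pos = (1 - belief) * (1 - accuracy)
    return true_pos / (true_pos + false_pos)


belief = 1e-6  # before any test: 1 in a million
beliefs = []
for _ in range(2):
    belief = update(belief, 0.9999)
    beliefs.append(belief)

for n, b in enumerate(beliefs, start=1):
    print(f"after {n} positive test(s): {b:.2%}")
```

After the first positive the belief sits at about 1%, and after the second it is about 99%, because the second test starts from 1 in 101 rather than 1 in a million.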