You are doing some statistical analysis (I know we all do at some point, don’t we?). You want to have a fair sampling, so there is no bias in your results.
You start with all the known factors that could have an impact on your study, and try to split them as fairly as you can between the different test groups, so each group has a “similar mix” (whatever that really means). You spend several hours doing so (you are a really, really fair guy after all) and finally you find an arrangement that looks pretty good.
Then that weird feeling starts growing inside you, a cold sweat appears and your mouth suddenly gets dry…
What about the factors you don’t know about?
Are they fairly split through your samples?
Would they influence your results?
How can you be sure?
The answer to this seemingly impossible conundrum is easier than expected! RANDOMIZE!!!!!
Key insight: How do you ensure your sample is fairly split among the different factors that can influence your test? Randomize!!!
Randomizing your samples makes it very likely that the mix in each set is similar. That gives you control over the factors by sharing their impact evenly across your test groups, and that’s true even for the unknown ones!
So we can’t know it all, but if we join forces with randomness we can control almost all of it.
Final Thought: Random is chance. Randomness won’t ensure your samples get correctly mixed every time, but statistically the odds are on your side.
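If you want to see it with your own eyes, here is a minimal Python sketch. Everything in it is made up for illustration: the `randomize_groups` helper, the 1000 fake subjects, and the `hidden` flag standing in for a factor we don’t know about (given to roughly 30% of them). The assignment never looks at `hidden`, yet both groups should end up with a similar share of it.

```python
import random

def randomize_groups(subjects, n_groups=2, seed=None):
    """Shuffle the subjects and deal them round-robin into n_groups groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)      # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return [shuffled[i::n_groups] for i in range(n_groups)]

# Hypothetical population: "hidden" is a factor we never inspect when assigning.
population_rng = random.Random(42)
subjects = [{"id": i, "hidden": population_rng.random() < 0.3} for i in range(1000)]

groups = randomize_groups(subjects, n_groups=2, seed=7)

# Only now do we peek at the hidden factor to see how it was split.
shares = [sum(s["hidden"] for s in g) / len(g) for g in groups]
for k, (g, share) in enumerate(zip(groups, shares)):
    print(f"group {k}: size={len(g)}, hidden-factor share={share:.3f}")
```

The seeds are only there so the run is repeatable; in a real study you would not pick the seed after looking at the result.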
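How much are the odds on your side? A rough way to get a feel for it is to repeat the random split many times and count how often it goes “wrong”. The sketch below is only an illustration under my own assumptions: a hypothetical `imbalance_rate` function, a hidden factor present in about 30% of subjects, two equal groups, and “wrong” defined as the two groups differing by more than 10 percentage points in that factor.

```python
import random

def imbalance_rate(n_subjects, prevalence=0.3, threshold_pct=10, trials=500, seed=0):
    """Fraction of random two-way splits in which the hidden-factor share
    differs between the groups by more than threshold_pct percentage points."""
    rng = random.Random(seed)
    half = n_subjects // 2
    bad = 0
    for _ in range(trials):
        # Draw a fresh hypothetical population, then randomize the assignment.
        hidden = [rng.random() < prevalence for _ in range(n_subjects)]
        rng.shuffle(hidden)
        a, b = hidden[:half], hidden[half:]
        # Compare the two shares using integers only, to avoid float edge cases:
        # |sum(a)/len(a) - sum(b)/len(b)| > threshold_pct / 100
        gap = abs(sum(a) * len(b) - sum(b) * len(a))
        if 100 * gap > threshold_pct * len(a) * len(b):
            bad += 1
    return bad / trials

for n in (20, 200, 2000):
    print(f"{n} subjects: badly mixed in {imbalance_rate(n):.1%} of random splits")
```

With small samples a bad mix is still fairly common; as the sample grows it should become rare, which is the sense in which chance works for you rather than against you.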
“The bigger problem is that factors we can’t anticipate or control may matter, and we’d like them scattered evenly between the two treatment groups. If we knew what the factors were, we could assure that they’re evenly split between the groups. The hope is that randomization will do that for us with things we’re unaware of. ”