Data Scientist Interview

# How would you explain a confidence interval to a non-technical audience?

When we want to make a general statement about a large population, it is usually impractical to collect data from every member. So we take a sample of the population and use a statistic calculated from that sample as a proxy for the population parameter.

DS_B, Feb 6, 2018
When we want to make a general statement about a large population, it is usually impractical to collect data from every member. So we take a sample of the population and use the sample statistic to estimate the population parameter.

But different samples give different sample statistics. Which one is a good approximation of the population parameter?

So instead of reporting a single number, we come up with an interval that we can say, with a stated level of confidence (95%, for example), contains the population parameter.
This interval is called a confidence interval.
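
The idea above can be sketched in a few lines of Python. This is only a minimal sketch: it assumes a made-up, roughly normal population and the common normal-approximation formula (sample mean plus or minus 1.96 standard errors). The function name `mean_confidence_interval` and every number in it are illustrative and not part of the answer.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - z * std_err, mean + z * std_err


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 values centred near 50 with spread near 10.
population = [rng.gauss(50, 10) for _ in range(100_000)]
true_mean = statistics.fmean(population)

# Each sample gives a different estimate and therefore a different interval.
intervals = [mean_confidence_interval(rng.sample(population, 200)) for _ in range(5)]
for low, high in intervals:
    print(f"({low:.2f}, {high:.2f})  contains true mean: {low <= true_mean <= high}")
```

Each printed interval is a little different because each sample is different, which is exactly why an interval is reported rather than a single number.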

DS_B, Feb 6, 2018
Take yourself back to high school. You were out sick one day. That day you missed a massive food fight. You show up to school the next day and you want to know who or what caused the fight. Now imagine two scenarios:

A) You talk to 100 people and their stories are pretty much aligned: it was Bill who started it.
B) You talk to 10 people and each one identifies a different person.

There is less certainty in situation B. This is clear.

It is the same with confidence intervals. When we estimate a feature of a population (a person's entire bloodstream or a nation's voting habits, for example) by using a sample (a blood test or a survey), our confidence in that estimate is a function of the sample size and the standard deviation (i.e. how much the information varies from one observation to the next).
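
To make the dependence on sample size and standard deviation concrete, here is a minimal sketch using the usual normal-approximation margin of error, 1.96 × standard deviation / √n. The helper name `interval_half_width` and the numbers are hypothetical, chosen only to mirror the two scenarios (many consistent accounts versus few scattered ones).

```python
import math


def interval_half_width(std_dev, n, z=1.96):
    """Margin of error of an approximate 95% interval for a mean."""
    return z * std_dev / math.sqrt(n)


# Compare low vs. high variability and small vs. large samples.
for std_dev in (5, 20):
    for n in (10, 100):
        margin = interval_half_width(std_dev, n)
        print(f"std_dev={std_dev:>2}, n={n:>3}: estimate +/- {margin:.2f}")
```

The interval is narrowest with the larger sample and the smaller spread (scenario A) and widest with the smaller sample and the larger spread (scenario B).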

Brendan, Feb 16, 2018
It is a range of values, calculated from a sample of the population, that is likely to contain the true population parameter at a stated confidence level.
For example, suppose we want to figure out the average height of women in the U.S. If someone tells you that the 95% confidence interval for the average is (5'2", 5'7"), that means the method used to build the interval captures the true average about 95% of the time: if we repeated the sampling 100 times, we would expect roughly 95 of the resulting intervals to contain the true average height. It does not mean that 95% of individual women are between 5'2" and 5'7" tall; that would be a statement about individuals, not about the average.
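
One way to check a reading of a 95% interval is to simulate it. The sketch below assumes normally distributed heights with a made-up mean and standard deviation (in inches) and a normal-approximation interval; `interval_for_mean` and all constants are illustrative. It measures two different things: how often the interval for the mean contains the true mean, and what share of individuals falls inside one such interval.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable
TRUE_MEAN, TRUE_SD, N = 64.0, 2.5, 100  # made-up heights in inches


def interval_for_mean(sample, z=1.96):
    """Approximate 95% confidence interval for the mean of the sample."""
    mean = statistics.fmean(sample)
    margin = z * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - margin, mean + margin


# 1) Repeat the sampling many times: how often is the true mean captured?
trials = 2000
hits = 0
for _ in range(trials):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    low, high = interval_for_mean(sample)
    if low <= TRUE_MEAN <= high:
        hits += 1
coverage = hits / trials

# 2) Take one interval: what share of individual heights lies inside it?
low, high = interval_for_mean([rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)])
people = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(10_000)]
share_inside = sum(low <= h <= high for h in people) / len(people)

print(f"intervals containing the true mean: {coverage:.1%}")
print(f"individuals inside one interval:    {share_inside:.1%}")
```

Under these assumptions the first figure should come out close to 95%, while the second is far lower, because an interval for the average is much narrower than the spread of individual heights.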

CathyQian, Apr 10, 2018

