# Basics of Statistics – Sample Vs Population Data

In statistics, a population is the whole group of interest, meaning every member of that group, and a sample is a part of that population.
Sometimes it’s not possible to survey every member of a group, so we collect data from a sample and draw conclusions from it. If you go into a chocolate store, the owner might have samples of their products on display. It wouldn’t be possible for you to sample everything in the store: financially, the owner wouldn’t want you to taste everything for free, and you probably wouldn’t want to eat chocolate from a couple hundred jars, or you might get sick to your stomach. So you might base your opinion of the store’s entire candy line on the samples on offer. The same logic holds for most surveys in statistics: you only take a sample of the whole population (the “population” in this example is the entire candy line). The result is a statistic, computed from the sample, that serves as an estimate for the population. We can rarely collect data on all members or items of a population because of the time and expense involved, but we still need to be able to draw conclusions about the entire population. A few things to keep in mind:
• All elements in a sample must also by definition be part of the population as it is defined.
• The sample should be representative of the population from which it is drawn.
• In almost all cases, samples from the same population should be independent of each other.
• A sample is always an approximation of the population.
• Sample data sets always have some error (uncertainty) built into them.
• Always ask whether the sample is a good approximation of the population.
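
To make the points above concrete, here is a minimal Python sketch. The population is made up for illustration (hypothetical ratings from 1 to 10 for 200 jars in the chocolate store), and the seed and sample size are arbitrary choices. It shows that every sampled value comes from the population and that the sample mean only approximates the population mean.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical population: a rating (1-10) for each of the 200 jars in the store
population = [random.randint(1, 10) for _ in range(200)]

# Taste only 20 jars instead of all 200
sample = random.sample(population, 20)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
# The gap between the two is the error that comes with using a sample
print(f"Sampling error:  {sample_mean - population_mean:+.2f}")
```

Changing the seed or the sample size changes the sample mean, while the population mean stays the same. That gap is the uncertainty mentioned in the list above.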

Population standard deviation:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

The symbol $\sigma$ represents the population standard deviation, $\mu$ is the population mean, and $N$ is the number of elements in the population. The radical sign denotes the square root. The term $\sum (X_i - \mu)^2$ represents the sum of the squared deviations of the scores from their population mean.

Sample standard deviation:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Here $\bar{x}$ is the sample mean and $n$ is the number of elements in the sample. The term $\sum (x_i - \bar{x})^2$ represents the sum of the squared deviations of the scores from the sample mean.

A measurable characteristic of a population, such as a mean or standard deviation, is called a parameter; a measurable characteristic of a sample is called a statistic.

A sampling method is a procedure for selecting sample elements from a population. Simple random sampling refers to a sampling method that has the following properties:
• The population consists of N objects.
• The sample consists of n objects.
• All possible samples of n objects are equally likely to occur.
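
The standard deviation formulas and the idea of a simple random sample can be tried out together in a short Python sketch. The eight population values, the seed, and the function names `population_std` and `sample_std` are illustrative assumptions, not part of any library. `random.sample` is used here as one way to draw a simple random sample; it never picks the same element twice.

```python
import math
import random

def population_std(values):
    """Population standard deviation: divide the squared deviations by N."""
    mu = sum(values) / len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

def sample_std(values):
    """Sample standard deviation: divide the squared deviations by n - 1."""
    x_bar = sum(values) / len(values)
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (len(values) - 1))

# Hypothetical population of N = 8 scores (mean 5, squared deviations sum to 32)
population = [2, 4, 4, 4, 5, 5, 7, 9]
sigma = population_std(population)  # parameter: sqrt(32 / 8) = 2.0

# Simple random sample of n = 4 objects; every subset of 4 is equally likely
random.seed(0)
sample = random.sample(population, 4)
s = sample_std(sample)  # statistic: an estimate of sigma

print(f"Population standard deviation (parameter): {sigma:.3f}")
print(f"Sample: {sample}")
print(f"Sample standard deviation (statistic): {s:.3f}")
```

The standard library's `statistics.pstdev` and `statistics.stdev` compute the same two quantities; the hand-written versions are only there to mirror the formulas term by term.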

When a population element can be selected more than once, we are sampling with replacement. When a population element can be selected only once, we are sampling without replacement.

I hope you liked my post. I shall explain the concept of variables in my next post. Stay tuned 🙂

IT engineer, Machine Learning enthusiast and aqua hobbyist.