Populations and Samples
Collecting data correctly is just as important as analyzing it accurately. Suppose we want to determine the average tail length of Golden Retrievers. How would we collect this data? Measuring every Golden Retriever’s tail would be nearly impossible. Instead, it makes more sense to measure the tails of a smaller group of Golden Retrievers and use that group’s average as an estimate of the average tail length of the entire breed.
To draw conclusions about a complete collection of individuals, called the population, we study a subset of that collection, called a sample.
Whether a data set counts as a sample or as a population can also depend on why the data was collected. For example, data collected from all Golden Retrievers in California is sample data when we use it to represent a larger collection, such as all Golden Retrievers in the United States, but it is population data if we only care about describing the Golden Retrievers in California.
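The tail-length idea can be sketched in a few lines of Python. This is only an illustration: the population below is simulated, and the numbers (10,000 dogs, a typical tail length of 30 cm, a sample of 100) are invented assumptions, not real measurements.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: made-up tail lengths (cm) for 10,000 dogs.
population = [random.gauss(30, 3) for _ in range(10_000)]

# Simple random sample of 100 dogs, drawn without replacement.
sample = random.sample(population, 100)

population_mean = statistics.mean(population)  # usually unknown in practice
sample_mean = statistics.mean(sample)          # our estimate of it

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

In a real study we could not compute `population_mean` at all; it appears here only so the estimate from the sample can be compared against it.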
Examples of Populations and Samples
| Population | Sample |
|---|---|
| GDP of each country in the world | GDP of each country in Asia |
| All undergraduate students in the United States | Undergraduate students at Harvard University |
| Every earthquake worldwide in the past 100 years | The 20 strongest earthquakes recorded in the past 100 years |
| Patients with diabetes in a particular hospital | 50 diabetic patients from the hospital chosen for a clinical trial |
Notice that every observation in a sample is also part of the respective population.
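The subset relationship can be checked directly. The sketch below loosely follows the hospital example from the table; the patient IDs and the population size of 400 are hypothetical.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical IDs for 400 diabetic patients at one hospital (the population).
population = set(range(1, 401))

# 50 patients chosen at random for the clinical trial (the sample).
# random.sample needs a sequence, so the set is sorted into a list first.
sample = set(random.sample(sorted(population), 50))

# Every sampled patient is also a member of the population.
print(sample.issubset(population))  # True
print(len(sample), "of", len(population), "patients sampled")
```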
Population Parameters and Sample Statistics
We can summarize our data with measures of center, which describe a typical value, and measures of spread, which describe how much the values vary.
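As a rough illustration, the mean and median are common measures of center, and the range and standard deviation are common measures of spread. The sketch below computes them with Python's standard `statistics` module for five made-up tail lengths; the values are assumptions chosen only for the example.

```python
import statistics

# Hypothetical sample: tail lengths (cm) of five Golden Retrievers.
tail_lengths = [28.5, 31.0, 29.5, 33.0, 30.0]

# Measures of center
center_mean = statistics.mean(tail_lengths)
center_median = statistics.median(tail_lengths)

# Measures of spread
spread_range = max(tail_lengths) - min(tail_lengths)
spread_stdev = statistics.stdev(tail_lengths)  # sample standard deviation (divides by n - 1)

print(f"mean:   {center_mean:.2f}")
print(f"median: {center_median:.2f}")
print(f"range:  {spread_range:.2f}")
print(f"stdev:  {spread_stdev:.2f}")
```

`statistics.stdev` treats the data as a sample. If the list held the entire population, `statistics.pstdev` would be the matching choice.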