Inferential Statistics
The methods of inferential statistics are used to make predictions and estimates:
1. Predictions about future trends, assuming that current trends continue.
2. Estimates of the characteristics of a whole population, based on data collected from a representative sample.
Probability theory is used along with inferential statistics to set limits on the accuracy of any predictions or estimates.
The first part of a weather report is typically devoted to describing the weather events that have already occurred today – sunshine, rain, winds, or storms (descriptive statistics). The second part of a weather report is typically devoted to predicting the weather for tomorrow, and perhaps the rest of the week (inferential statistics). For example, ‘there is a 40% chance of rain by the end of the week.’
Random samples
In many situations it is difficult, or impossible, to gather information from every member of a population. In a pre-election poll it would be very expensive and time-consuming to ask every voter in the country for his or her political preferences, and then repeat the survey every week during an election campaign. If you were testing the crashworthiness of automobiles, it would be pointless to crash-test every car to make sure it was safe, because the test destroys the car. The concept of a sample is fundamental to inferential statistics. A sample is a collection of items that is selected at random to represent a whole population. In a pre-election poll, perhaps 1000 voters from across the country are asked for their political preferences. Perhaps fewer than ten cars of a production model are crash-tested each year to demonstrate the safety features of that model.
The notion of random selection is a key feature of a sample. A ‘random sample’ implies that there is no bias in the selection of the sample items; every member of the population has an equal chance of being selected. A sample drawn from only one part of the population is likely to be biased. For example, if you asked only people applying for hunting licenses for their views on gun control, their replies would probably not be representative of the whole population.
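The idea of giving every member an equal chance of selection can be sketched in a few lines of Python. This is only an illustration: the population of 10,000 numbered voters, the sample size of 1000, and the helper name simple_random_sample are assumptions made for the example, not part of any real survey.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct members; each member is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, k)

# Hypothetical population: voters identified by the numbers 0..9999
voters = list(range(10_000))

# Draw 1000 voters without replacement
sample = simple_random_sample(voters, 1000, seed=42)
print(len(sample), "voters selected, all distinct:", len(set(sample)) == len(sample))
```

Selecting only from a convenient subgroup, such as the first 1000 names on a list, would not meet this standard, because voters outside that subgroup would have no chance of being chosen.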
Limits on predictions
When inferential statistics are used to make a prediction or an estimate, the results are likely to be more accurate when they are based on a larger random sample and when the data collected are more consistent, that is, when the individual values vary less from one another.
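One common way to put a number on both effects is the standard error of a sample mean, which is the sample standard deviation divided by the square root of the sample size. The sketch below uses two small made-up data sets, chosen only to show the contrast; the function name standard_error is also just an illustrative choice.

```python
import math
import statistics

def standard_error(values):
    """Standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Made-up measurements with the same mean (50) but different spread
consistent = [49, 50, 51, 50, 49, 51, 50, 50]
scattered = [30, 70, 45, 55, 20, 80, 50, 50]

print("consistent data:", round(standard_error(consistent), 2))
print("scattered data: ", round(standard_error(scattered), 2))
```

The more consistent data set gives the smaller standard error. Because the sample size appears under a square root, a sample four times as large roughly halves the standard error when the spread stays the same.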
Calculating the limits on the accuracy of a prediction or estimate can be a bit complicated. For example, a poll of the Canadian electorate in March 2007 found that the Conservative Party had the support of 38% of the voters. This number was followed by the phrase “within 3 percent, nineteen times out of twenty”. Perhaps only 1000 out of twenty million potential voters were sampled. Data from the 1000 responses would then have been used to construct a model of the whole population.
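Finally, with the aid of probability theory, it was estimated that if the same poll were repeated many times, about nineteen results out of every twenty would fall within 3 percentage points of the true level of support in the whole population. For this poll, that gives a likely range for Conservative Party support of 38% ± 3 percentage points, or from 35% to 41%.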
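The "within 3 percent, nineteen times out of twenty" figure can be roughly reproduced with the usual normal approximation for a proportion. This is a sketch under stated assumptions, not the pollster's actual method: it assumes a simple random sample of 1000 voters (the text says only "perhaps" 1000) and uses z = 1.96, the standard multiplier for 95% confidence, which corresponds to nineteen times out of twenty.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion p from n responses.

    z = 1.96 corresponds to 95% confidence (nineteen times out of twenty).
    Assumes simple random sampling and the normal approximation.
    """
    return z * math.sqrt(p * (1 - p) / n)

p = 0.38   # reported support for the Conservative Party
n = 1000   # assumed sample size

moe = margin_of_error(p, n)
print(f"{p:.0%} ± {moe:.1%}")                    # prints: 38% ± 3.0%
print(f"range: {p - moe:.0%} to {p + moe:.0%}")  # prints: range: 35% to 41%
```

Under these assumptions the calculation gives a margin of about 3 percentage points, which agrees with the figure quoted for the poll.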
A basic understanding of statistical methods has at least two distinct advantages.
First, it is very useful to be able to grasp and evaluate what others are describing with statistics. Second, when used properly, statistical methods provide clear information, with known ranges of error, about the world we live in. Statistical results should be valued when they are based on objective data gathering and analysis.