Empirical Rule Definition
The empirical rule is a statistical rule stating that, in a normal distribution, almost all data falls within three standard deviations of the mean. Specifically, about 68% of the data falls within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations.
The empirical rule is also known as the three-sigma rule or the 68-95-99.7 rule.
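The three percentages are rounded values of the normal distribution's coverage. As a minimal sketch, they can be reproduced with Python's standard library; the function name coverage_within is an illustrative choice, not part of any library.

```python
import math

def coverage_within(k: float) -> float:
    """Probability that a normally distributed value lies within
    k standard deviations of the mean (illustrative helper)."""
    # For a standard normal Z, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {coverage_within(k):.2%}")
```

The results are roughly 68.27%, 95.45%, and 99.73%, which the rule rounds to 68%, 95%, and 99.7%.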
A Little More on What is the Empirical Rule
In statistics, the empirical rule is used to forecast likely outcomes. Once the mean and standard deviation are known, it gives a rough estimate of how the data will be spread, even before every data point has been collected. Such estimates are often used in the interim, since gathering complete and accurate data may require a great deal of effort from the researcher. The rule also serves as a rough check of whether a distribution is normal: when too many data points fall outside three standard deviations of the mean, the distribution is probably not normal.
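The normality check described above can be sketched as follows. This is only a rough illustration, assuming a simulated sample in place of real data; the helper name share_within and the sample parameters are hypothetical. It is not a formal normality test.

```python
import random
import statistics

def share_within(data, k):
    """Fraction of data points within k standard deviations of the mean
    (illustrative helper)."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    inside = sum(1 for x in data if abs(x - mean) <= k * sd)
    return inside / len(data)

# Simulated sample standing in for real observations (assumption).
random.seed(0)
sample = [random.gauss(13.1, 1.5) for _ in range(10_000)]

# Compare the observed shares with the 68-95-99.7 benchmarks.
for k, benchmark in ((1, 0.68), (2, 0.95), (3, 0.997)):
    print(f"within {k} sd: observed {share_within(sample, k):.3f}, rule {benchmark}")
```

If the observed shares are far from the benchmarks, for example many points lying beyond three standard deviations, that suggests the data is not normally distributed.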
Examples of the Empirical Rule
Suppose the lifespans of the animals in a zoo are known to be normally distributed. On average, an animal lives 13.1 years, and the standard deviation is 1.5 years. To estimate the probability that an animal lives more than 14.6 years, one can use the empirical rule. With a mean of 13.1 years, the lifespan ranges for each number of standard deviations are as follows:
One standard deviation: 13.1 - 1.5 to 13.1 + 1.5, or 11.6 to 14.6 years
Two standard deviations: 13.1 - (2 x 1.5) to 13.1 + (2 x 1.5), or 10.1 to 16.1 years
Three standard deviations: 13.1 - (3 x 1.5) to 13.1 + (3 x 1.5), or 8.6 to 17.6 years
The task is to find the probability that an animal lives at least 14.6 years. By the empirical rule, about 68% of a normal distribution lies within one standard deviation of the mean, which in this example is the range 11.6 to 14.6 years. The remaining 32% lies outside this range and, because the normal distribution is symmetric, it splits evenly: 16% above 14.6 years and 16% below 11.6 years. Hence, the probability that an animal lives beyond 14.6 years is about 16%.
Consider another example. The mean lifespan of an animal kept in a zoo is 10 years, and the standard deviation is 1.4 years. If a zoo manager wants to find the probability that an animal lives at least 7.2 years, the ranges are as follows:
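The arithmetic of this example can be sketched in a few lines. The constant names are illustrative, and the comparison against the exact normal tail (via statistics.NormalDist) is an addition for context.

```python
from statistics import NormalDist

MEAN, SD = 13.1, 1.5  # mean lifespan and standard deviation, in years

# Ranges covered by one, two, and three standard deviations.
for k in (1, 2, 3):
    low, high = MEAN - k * SD, MEAN + k * SD
    print(f"{k} sd: {low:.1f} to {high:.1f} years")

# Empirical rule: 68% within one sd, so half of the remaining 32% is above.
rule_tail = (1 - 0.68) / 2

# Exact tail probability of living beyond mean + 1 sd, for comparison.
exact_tail = 1 - NormalDist(MEAN, SD).cdf(MEAN + SD)

print(f"rule estimate: {rule_tail:.2%}, exact value: {exact_tail:.2%}")
```

The rule's 16% is a rounded version of the exact tail probability, which is a little under 16%.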
One standard deviation: 8.6 to 11.4 years
Two standard deviations: 7.2 to 12.8 years
Three standard deviations: 5.8 to 14.2 years
By the empirical rule, about 95% of the distribution lies within two standard deviations of the mean. The remaining 5% lies outside this range, split evenly: 2.5% above 12.8 years and 2.5% below 7.2 years. Hence, the probability that the animal lives at least 7.2 years is:
95% + (5% / 2) = 97.5%
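The same calculation can be checked with a short sketch. The variable names are illustrative, and the exact value from statistics.NormalDist is shown only for comparison with the rule's estimate.

```python
from statistics import NormalDist

mean, sd = 10.0, 1.4          # mean lifespan and standard deviation, in years
threshold = mean - 2 * sd     # two standard deviations below the mean

# Empirical rule: 95% within two sd, plus the upper half of the remaining 5%.
rule_estimate = 0.95 + 0.05 / 2

# Exact probability of living at least `threshold` years.
exact = 1 - NormalDist(mean, sd).cdf(threshold)

print(f"threshold: {threshold:.1f} years")
print(f"rule estimate: {rule_estimate:.1%}, exact value: {exact:.2%}")
```

The exact figure is slightly higher than 97.5% because the empirical rule rounds the two-standard-deviation coverage down to 95%.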