Confidence Interval Definition
A confidence interval (CI) is a range of values, computed from sample data, within which an unknown population parameter is estimated to lie. Loosely speaking, it expresses how confident one is that the parameter lies within that range. For instance, one can say, “I am 95 percent confident that between 45 and 56 percent of Americans prefer cooking their own food to eating out.”
A Little More on What is a Confidence Interval
A confidence interval does not give the true value of a population parameter but a range of values within which the true value is likely to lie. If, for instance, a student wants to find the percentage of Americans who love hockey, a random sample is taken and the participants are asked whether they love hockey. Because the sample is random, the CI constructed from the data is also random: a different sample would give a different interval.
The most common confidence level is 95 percent, but others, such as 99 and 90 percent, are also used. The higher the confidence level, the wider the range of values; the lower the confidence level, the narrower the range. The width of a CI also depends on the size of the sample and on the variability of the data.
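The effect of these three factors can be sketched numerically. The snippet below is a minimal illustration, assuming a CI for a mean built with the normal approximation (half-width = z × s / √n); the function name `half_width` and the example numbers are hypothetical and not taken from the text.

```python
from math import sqrt
from statistics import NormalDist


def half_width(std_dev, n, level):
    """Half-width of a normal-approximation CI for a mean.

    Assumes the interval has the form: estimate +/- z * std_dev / sqrt(n).
    """
    # Two-sided critical value, e.g. about 1.96 for level = 0.95
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * std_dev / sqrt(n)


# Hypothetical baseline: standard deviation 10, sample of 100
for level in (0.90, 0.95, 0.99):
    print(f"level={level:.2f}  half-width={half_width(10, 100, level):.2f}")

# Larger sample -> narrower interval; more variable data -> wider interval
print(f"n=400:      {half_width(10, 400, 0.95):.2f}")
print(f"std_dev=20: {half_width(20, 100, 0.95):.2f}")
```

With these made-up numbers the half-width grows from about 1.64 at the 90 percent level to about 2.58 at the 99 percent level, while quadrupling the sample size halves it.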
The introduction of CI into statistics in 1937 is credited to Jerzy Neyman.
Interval estimation is not the same as point estimation. A point estimate is a single value that estimates a population parameter of interest, for instance the mean of a given quantity, whereas an interval estimate specifies a range within which the parameter is estimated to lie. CIs are usually reported in graphs or tables together with point estimates to show the reliability of the estimates.
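As a rough illustration of the distinction, the sketch below computes both kinds of estimate from the same data. The measurements are made up, the helper name `mean_with_ci` is hypothetical, and the interval uses the normal approximation; for a sample this small a t-based critical value would normally be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_ci(data, level=0.95):
    """Return (point_estimate, (lower, upper)) for the population mean."""
    point = mean(data)                        # point estimate: a single value
    std_err = stdev(data) / sqrt(len(data))   # estimated standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return point, (point - z * std_err, point + z * std_err)


# Made-up measurements, for illustration only
sample = [4.8, 5.1, 5.4, 4.9, 5.0, 5.3, 5.2, 4.7, 5.1, 5.0]
point, (lower, upper) = mean_with_ci(sample)
print(f"point estimate:    {point:.2f}")
print(f"interval estimate: {lower:.2f} to {upper:.2f}")
```

The point estimate is one number; the interval estimate brackets it and conveys how precisely the sample pins the mean down.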
A confidence interval is useful, for instance, in judging the reliability of survey results. Consider a poll of voting intentions in which 40 percent of the respondents want to vote for a certain party. A 99 percent confidence interval for the share in the whole population might run from 30 to 50 percent. From the same data, a 90 percent confidence interval would be narrower, roughly 34 to 46 percent. The size of the sample is a major factor in determining the width of the CI.
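The survey example can be sketched in code. This is an illustration only: it assumes the normal-approximation (Wald) interval for a proportion and a hypothetical sample of 160 respondents, since the text gives no sample size. The function name `proportion_ci` is likewise an assumption.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, level):
    """Wald (normal-approximation) CI for a population proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    std_err = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * std_err, p_hat + z * std_err


p_hat = 0.40   # 40 percent of respondents favour the party
n = 160        # hypothetical sample size, not given in the text

for level in (0.99, 0.90):
    lower, upper = proportion_ci(p_hat, n, level)
    print(f"{level:.0%} CI: {lower:.1%} to {upper:.1%}")

# Same sample proportion, four times as many respondents:
# the interval is half as wide
lower, upper = proportion_ci(p_hat, 4 * n, 0.99)
print(f"99% CI with n={4 * n}: {lower:.1%} to {upper:.1%}")
```

With the assumed sample size, the 99 percent interval comes out at roughly 30 to 50 percent, and the 90 percent interval computed from the same data is narrower.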
Meaning and interpretation
A confidence interval can be interpreted in several ways. Taking a 90 percent CI as an example:
- The CI can be expressed in terms of repeated samples. If the same procedure were applied to many different samples, the fraction of calculated CIs that contain the true population parameter would tend towards 90 percent.
- The CI can be expressed in terms of a single sample. In this case, there is a 90 percent probability that the CI calculated from a future experiment will contain the true value of the unknown population parameter. Note that this is a probability statement about the interval, not about the population parameter; it considers the probability from the point of view before the experiment is carried out.
- The CI can also be said to contain the values of the population parameter for which the difference between the parameter and the observed estimate is not statistically significant at the 10 percent level.
In each of the above, if the true value of the parameter falls outside the 90 percent CI, then a sampling event has occurred (namely, obtaining a point estimate this far from the true value) that had a probability of 10 percent or less of happening by chance.
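The repeated-sampling interpretation can be checked by simulation. The sketch below is an illustration under stated assumptions: a normally distributed population with a known standard deviation, and made-up values for the true mean, sample size and number of trials. The function name `coverage` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def coverage(level=0.90, true_mean=50.0, sigma=10.0, n=30, trials=2000, seed=1):
    """Fraction of simulated CIs that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * sigma / sqrt(n)   # sigma is treated as known for simplicity
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        centre = mean(sample)    # the interval moves with each new sample
        if centre - half <= true_mean <= centre + half:
            hits += 1
    return hits / trials


print(f"share of 90% CIs containing the true mean: {coverage():.3f}")
```

Each simulated interval either contains the true mean or it does not; it is the long-run share of intervals that do which should settle close to 0.90.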