Linear Regression Definition
In statistics, linear regression is a linear approach to modeling the relationship between a scalar response and one or more explanatory variables. With a single explanatory variable it is called simple linear regression; with more than one it is called multiple linear regression. Both are distinct from multivariate linear regression, which predicts several correlated dependent variables rather than a single scalar one.
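The three cases can be told apart by the shapes of the arrays involved. The following is a minimal sketch, assuming NumPy and synthetic data; the variable names, coefficients, and noise levels are illustrative and not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # number of observations (arbitrary choice for the sketch)

# Simple linear regression: one explanatory variable, one scalar response.
x = rng.normal(size=(n, 1))
y_simple = 2.0 + 3.0 * x[:, 0] + rng.normal(scale=0.1, size=n)
X1 = np.column_stack([np.ones(n), x])  # intercept column + 1 predictor
beta_simple, *_ = np.linalg.lstsq(X1, y_simple, rcond=None)

# Multiple linear regression: several explanatory variables, one scalar response.
Z = rng.normal(size=(n, 3))
y_multiple = 1.0 + Z @ np.array([0.5, -1.0, 2.0]) + rng.normal(scale=0.1, size=n)
X3 = np.column_stack([np.ones(n), Z])  # intercept column + 3 predictors
beta_multiple, *_ = np.linalg.lstsq(X3, y_multiple, rcond=None)

# Multivariate linear regression: several response variables fitted at once.
B_true = np.array([[0.5, 1.0], [-1.0, 0.0], [2.0, -2.0]])
Y_multi = Z @ B_true + rng.normal(scale=0.1, size=(n, 2))
B_multivariate, *_ = np.linalg.lstsq(X3, Y_multi, rcond=None)

print(beta_simple.shape)     # (2,)   intercept and one slope
print(beta_multiple.shape)   # (4,)   intercept and three slopes
print(B_multivariate.shape)  # (4, 2) one coefficient column per response
```

In the multivariate case the coefficients form a matrix rather than a vector, which is the practical difference from multiple linear regression.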
A Little More on What is Linear Regression
Linear regression models these relationships using linear predictor functions whose unknown parameters are estimated from the data; such models are called linear models. Most commonly, the conditional mean of the response given the values of the explanatory variables is assumed to be an affine function of those values. Less commonly, the conditional median or some other quantile is used instead.
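In symbols, the affine conditional-mean assumption can be sketched as follows. The notation ($y$ for the response, $x_1, \dots, x_p$ for the explanatory variables, $\beta_0, \dots, \beta_p$ for the unknown parameters, $\varepsilon_i$ for the error term) is assumed here for illustration and does not come from the text above.

```latex
% Conditional mean of the response as an affine function of the predictors
\operatorname{E}[\,y \mid x_1, \dots, x_p\,] = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p

% Equivalent form for the i-th observation, with a zero-mean error term
y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i,
\qquad \operatorname{E}[\,\varepsilon_i \mid x_{i1}, \dots, x_{ip}\,] = 0
```

The intercept $\beta_0$ is what makes the function affine rather than strictly linear in the explanatory variables.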
Linear regression focuses on the conditional probability distribution of the response given the values of the predictors, rather than on the joint probability distribution of all of these variables, which is the domain of multivariate analysis.
Linear regression has been studied rigorously and is used extensively in practical applications. One reason is that models that depend linearly on their unknown parameters are easier to fit than models that depend non-linearly on them. Another is that the statistical properties of the resulting estimators are easier to determine.
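Both points can be illustrated with a short sketch, assuming NumPy, synthetic data, and ordinary least squares with uncorrelated, equal-variance errors (assumptions made for this example, not stated in the text). The model below is quadratic in `x` but linear in its parameters, so it is fitted by solving one linear system, and standard errors follow from a closed-form covariance expression.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-2.0, 2.0, 100)
y = 1.0 - 2.0 * x + 0.5 * x**2 + rng.normal(scale=0.05, size=x.size)

# Design matrix with columns 1, x, x^2: non-linear in x, linear in the parameters.
X = np.column_stack([np.ones_like(x), x, x**2])

# Normal equations (X^T X) beta = X^T y, solved directly with no iterative search.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Under the stated error assumptions, Cov(beta_hat) = sigma^2 (X^T X)^{-1},
# with sigma^2 estimated from the residuals and n - p degrees of freedom.
residuals = y - X @ beta_hat
sigma2_hat = residuals @ residuals / (X.shape[0] - X.shape[1])
std_errors = np.sqrt(np.diag(sigma2_hat * np.linalg.inv(X.T @ X)))

print(beta_hat)    # estimates of the three coefficients
print(std_errors)  # their estimated standard errors
```

A model that is non-linear in its parameters would generally need an iterative optimizer and would not have such a simple expression for the estimator's covariance.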
Most applications of linear regression fall into one of the following two broad categories:
- If the goal is prediction, forecasting, or error reduction, linear regression can be used to fit a predictive model to an observed data set of values of the response and explanatory variables. Once the model has been fitted, it can be used to predict the response for additional values of the explanatory variables that are collected without an accompanying response value.
- If the goal is to explain variation in the response variable that can be attributed to variation in the explanatory variables, linear regression can be used to quantify the strength of the relationship between them. In particular, it can help determine whether some explanatory variables have no linear relationship with the response at all, or identify which subsets of explanatory variables contain redundant information about the response.
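The two uses described above can be sketched with a single fitted model. This is an illustrative example assuming NumPy and synthetic data in which the second explanatory variable is deliberately unrelated to the response; the helper `add_intercept` and all variable names are hypothetical, and R² is used here as one common summary of the strength of the relationship.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 150
X_obs = rng.normal(size=(n, 2))
# The response depends on the first column only; the second is irrelevant.
y_obs = 4.0 + 1.5 * X_obs[:, 0] + rng.normal(scale=0.5, size=n)

def add_intercept(X):
    """Prepend a column of ones so the model includes an intercept."""
    return np.column_stack([np.ones(X.shape[0]), X])

A = add_intercept(X_obs)
beta_hat, *_ = np.linalg.lstsq(A, y_obs, rcond=None)

# Use 1 (prediction): responses for new explanatory values with no observed response.
X_new = np.array([[0.0, 0.0], [1.0, -1.0]])
y_pred = add_intercept(X_new) @ beta_hat

# Use 2 (explanation): share of the response variance accounted for by the model.
fitted = A @ beta_hat
ss_res = np.sum((y_obs - fitted) ** 2)
ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(y_pred)     # predicted responses for the two new rows
print(r_squared)  # strength of the fitted linear relationship
print(beta_hat)   # the coefficient on the irrelevant column should be near zero
```

A coefficient close to zero relative to its uncertainty is a hint, not a proof, that a variable has no linear relationship with the response; a formal significance test would be needed to decide.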
Linear regression models are usually fitted using the least squares approach, but they may also be fitted in other ways, such as by minimizing the lack of fit in some other norm, or by minimizing a penalized version of the least squares cost function, as in ridge regression and the lasso. Conversely, the least squares approach can also be used to fit models that are not linear, so the terms "least squares" and "linear model" are closely linked but not synonymous.
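As a rough comparison of plain and penalized least squares, the sketch below fits the same data with ordinary least squares and with ridge regression in its closed form. It assumes NumPy, synthetic and nearly collinear predictors, no intercept term, and a penalty weight `lam` chosen arbitrarily; the functions `fit_ols` and `fit_ridge` are hypothetical helpers written for this example.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares: minimize ||y - X b||^2."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def fit_ridge(X, y, lam):
    """Ridge regression: minimize ||y - X b||^2 + lam * ||b||^2.

    Closed form: solve (X^T X + lam * I) b = X^T y.
    No intercept is fitted, so every coefficient is penalized.
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)  # nearly collinear with x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(scale=0.1, size=n)

beta_ols = fit_ols(X, y)
beta_ridge = fit_ridge(X, y, lam=1.0)

print(beta_ols)    # can be unstable when predictors are nearly collinear
print(beta_ridge)  # shrunk toward zero by the penalty
```

The lasso uses an absolute-value penalty instead of a squared one and has no closed-form solution in general, so it is normally fitted with an iterative solver rather than a single linear solve.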