The Usefulness of the R-squared Statistic

Abstract
Almost every actuarial department uses least-squares regression to fit frequency, severity, or pure premium data and determine loss trends. Many actuaries use the R² statistic to measure the goodness-of-fit of the trend. In fact, the R² statistic measures how significantly the slope of the fitted line differs from zero, which is not the same as a good fit. In the Fall 1991 Casualty Actuarial Society Forum, D. Lee Barclay wrote "A Statistical Note On Trend Factors: The Meaning of R-Squared." Through simple graphical examples, Barclay showed that the coefficient of determination (R²) is, by itself, a poor measure of goodness-of-fit. Barclay's numerical examples provide additional support for this argument. However, his paper did not analyze the formulas used in regression analysis. By understanding the formulas and what they describe, we can better understand why the R² statistic is not a reliable measure of a good fit. This paper analyzes three formulas central to regression analysis: (1) the basic linear regression model, (2) the Analysis of Variance sum of squares formulas, and (3) the R² formula in terms of the sums of squares. With an understanding of these formulas and what they measure, actuaries can use the R² value properly when determining the forecasted trend.
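The three formulas named in the abstract can be illustrated with a small sketch. It is not taken from the paper: the data are invented, and the function names `fit_line` and `anova_summary` are hypothetical. It assumes the standard textbook definitions, namely the model y = a + b·x + e, the decomposition SST = SSR + SSE, and R² = SSR / SST. Two series are given identical scatter about their fitted lines but different slopes, so only the slope can account for any difference in R².

```python
# Illustrative only: invented data, standard least-squares formulas.

def fit_line(x, y):
    """Ordinary least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def anova_summary(x, y):
    """Sum-of-squares decomposition and R^2 for the fitted line."""
    a, b = fit_line(x, y)
    y_bar = sum(y) / len(y)
    fitted = [a + b * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)          # explained by the line
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) # left unexplained
    return {"slope": b, "SST": sst, "SSR": ssr, "SSE": sse, "R2": ssr / sst}

# Hypothetical periods and a fixed disturbance pattern. The pattern sums to
# zero and is uncorrelated with x, so both fits leave the same residuals.
x = [1, 2, 3, 4, 5, 6]
noise = [1, -2, 1, 1, -2, 1]

steep = anova_summary(x, [100 + 10.0 * xi + e for xi, e in zip(x, noise)])
flat = anova_summary(x, [100 + 0.5 * xi + e for xi, e in zip(x, noise)])

for name, result in (("steep trend", steep), ("flat trend", flat)):
    print(name, {k: round(v, 4) for k, v in result.items()})
```

Both fits have the same error sum of squares, so the points lie equally close to each line, yet the steeper series reports a much higher R². Under these assumptions the statistic is responding to the size of the slope relative to the scatter, not to the closeness of the fit alone.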
Volume
Winter
Page
55-60
Year
1998
Categories
Financial and Statistical Methods
Statistical Models and Methods
Regression
Actuarial Applications and Methodologies
Ratemaking
Trend and Loss Development
Publications
Casualty Actuarial Society E-Forum
Authors
Ross A Fonticella