Monday, November 12, 2012

Why ANOVA and Linear Regression Are the Same Analysis

If your graduate statistical training was anything like mine, you learned ANOVA in one class and Linear Regression in another. My professors would often say things like "ANOVA is just a special case of Regression," then do a lot of hand waving when pressed to explain.

It was not until I started consulting that I realized how closely related ANOVA and regression are. They're not only related, they're the same thing. Not a quarter and a nickel--different sides of the same coin.

So here is a very simple example that shows why. When someone showed me this, a light bulb went on, even though I already knew both ANOVA and multiple linear regression quite well (and already had my master's in statistics!). I believe that understanding this little concept has been key to my understanding the general linear model as a whole--its applications are far reaching.

As an example, I use a model with a single categorical independent variable--employment category--with 3 categories: managerial, clerical, and custodial. The dependent variable is Previous Experience in months. (This data set is employment.sav, one of the data sets that come free with SPSS.)

We can run this as either an ANOVA or a regression. In the ANOVA, the categorical variable is effect coded, which means that each category's mean is compared to the grand mean. In the regression, the categorical variable is dummy coded**, which means that each category's intercept is compared to the reference group's intercept. Since the intercept is defined as the mean value when all other predictors = 0, and there are no other predictors, the three intercepts are just means.

In both analyses, Job Category has an F=69.192, with the same p-value. In the ANOVA, we find these means:

Clerical: 85.039
Custodial: 298.111
Manager: 77.619

In the Regression, we find these coefficients:

Intercept: 77.619
Clerical: 7.420
Custodial: 220.492

The intercept is simply the mean of the reference group, Managers. The coefficients for the other two groups are the differences in the mean between the reference group and the other groups.

You'll notice, for example, that the regression coefficient for Clerical is the difference between the mean for Clerical, 85.039, and the Intercept, or mean for Manager (85.039 - 77.619 = 7.420). The same works for Custodial.
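
To check that arithmetic for every group at once, here is a minimal Python sketch. It uses only the three means reported above; the function name `dummy_coefficients` is my own for illustration and is not something SPSS produces.

```python
# Group means as reported in the ANOVA output above.
means = {"Clerical": 85.039, "Custodial": 298.111, "Manager": 77.619}

def dummy_coefficients(means, reference):
    """Return (intercept, {group: coefficient}) implied by dummy coding.

    The intercept is the reference group's mean; every other coefficient
    is that group's mean minus the reference group's mean.
    """
    intercept = means[reference]
    coefs = {g: m - intercept for g, m in means.items() if g != reference}
    return intercept, coefs

intercept, coefs = dummy_coefficients(means, reference="Manager")
print(f"Intercept: {intercept:.3f}")
for group, b in coefs.items():
    print(f"{group}: {b:.3f}")
```

Changing `reference` to another group changes the intercept and the coefficients, but never the three means they add up to.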

So an ANOVA reports each mean and a p-value that says at least two are significantly different. A regression reports only one mean (as an intercept), and the differences between that one and all other means, but the p-values evaluate those specific comparisons.
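
If you want to see both views come out of the same numbers, the sketch below runs them side by side in plain Python. The data are made up for illustration (they are NOT the SPSS employment data), and the small `solve` helper is a hand-rolled stand-in for whatever your statistics package does internally. It computes the ANOVA F from sums of squares, then fits the dummy-coded regression by least squares and computes the regression F from R-squared.

```python
# Made-up toy data (NOT the SPSS data): months of experience by job category.
data = {
    "Manager":   [70.0, 80.0, 90.0],
    "Clerical":  [80.0, 85.0, 96.0],
    "Custodial": [280.0, 300.0, 320.0],
}
reference = "Manager"
others = [g for g in data if g != reference]

# --- ANOVA view: group means and F from between/within sums of squares ---
group_means = {g: sum(v) / len(v) for g, v in data.items()}
all_y = [y for v in data.values() for y in v]
n, k = len(all_y), len(data)
grand_mean = sum(all_y) / n
ss_between = sum(len(v) * (group_means[g] - grand_mean) ** 2 for g, v in data.items())
ss_within = sum((y - group_means[g]) ** 2 for g, v in data.items() for y in v)
f_anova = (ss_between / (k - 1)) / (ss_within / (n - k))

# --- Regression view: dummy-code, then solve the normal equations X'X b = X'y ---
rows = [([1.0] + [1.0 if g == o else 0.0 for o in others], y)
        for g, v in data.items() for y in v]

def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    size = len(m)
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(size):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][size] / m[i][i] for i in range(size)]

p = len(others) + 1  # intercept plus one dummy per non-reference group
xtx = [[sum(x[i] * x[j] for x, _ in rows) for j in range(p)] for i in range(p)]
xty = [sum(x[i] * y for x, y in rows) for i in range(p)]
beta = solve(xtx, xty)

fitted = [sum(b * xi for b, xi in zip(beta, x)) for x, _ in rows]
ss_resid = sum((y - f) ** 2 for (_, y), f in zip(rows, fitted))
ss_total = sum((y - grand_mean) ** 2 for y in all_y)
r2 = 1 - ss_resid / ss_total
f_regression = (r2 / (p - 1)) / ((1 - r2) / (n - p))

print("group means:", group_means)
print("coefficients:", dict(zip(["Intercept"] + others, beta)))
print("F (ANOVA):", f_anova, " F (regression):", f_regression)
```

The two F values agree up to floating-point rounding, the intercept equals the Manager mean, and each remaining coefficient equals that group's mean minus the Manager mean.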

It's all the same model, the same information, but presented in different ways. Understand what the model tells you in each way, and you are empowered.

I suggest you try this little exercise with any data set, then add in a second categorical variable, first without, then with an interaction. Go through the means and the regression coefficients and see how they add up.

**The dummy coding creates two 1/0 variables: Clerical = 1 for the clerical category, 0 otherwise; Custodial = 1 for the custodial category, 0 otherwise. Observations in the Managerial category have a 0 value on both of these variables, and this is known as the reference group.
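
As a rough illustration of that coding scheme, here is a short Python sketch. The helper `dummy_code` and the four sample observations are hypothetical, written only to show what the two 1/0 variables look like.

```python
def dummy_code(categories, reference):
    """Turn each observation's category into a dict of 1/0 indicator variables.

    One indicator is created per non-reference category; observations in the
    reference category get 0 on all of them.
    """
    levels = sorted(set(categories) - {reference})
    return [{level: int(cat == level) for level in levels} for cat in categories]

jobs = ["Clerical", "Custodial", "Managerial", "Clerical"]  # illustrative observations
for job, row in zip(jobs, dummy_code(jobs, reference="Managerial")):
    print(job, row)
```

The Managerial row comes out as 0 on both variables, which is exactly what makes it the reference group.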

And now I'd like to invite you to learn more about linear regression analysis, including interpreting interactions, centered predictors, polynomials, and more in one of my FREE monthly Analysis Factor Teleseminars: "Interpreting Linear Regression Parameters: A Walk Through Output." Visit Teletraining 4 to get started today.

© 2008 Karen Grace-Martin -- Statistical Consultant and founder of The Analysis Factor
