Proc Reg vs. Proc RobustReg
Posted: 01 March 2012 09:11 AM

I recently used “PROC ROBUSTREG” in order to identify outliers and leverage points. I used the following code, and it worked very well:

proc robustreg data=sample_data;
model Y = x1 x2 / diagnostics leverage;
output out=regress_data_reg3 r=resid sr=stdres;
run;

Just to be safe, and because I was not too familiar with PROC ROBUSTREG, I also ran PROC REG on the same data for comparison.
The outputs from the two procedures differ but contain many of the same items. I have two questions, though:
1. The R-squared differed between the two procedures even though they used the same data. Why?
2. The difference in R-squared most likely comes from the fact that PROC ROBUSTREG adds a variable titled “SCALE”. This adjusts the intercept coefficient, but my variable coefficients were not changed. Has anyone seen this? Is the new “SCALE” variable the reason the R-squared dropped? How should I interpret the coefficient of the “SCALE” variable?

Thanks,

Posted: 02 March 2012 01:18 PM   [ # 1 ]

Hi Devw18,

I found some background info on PROC ROBUSTREG here: http://www.lexjansen.com/phuse/2006/st/st01.pdf

The difference between OLS and ROBUSTREG is in the weights given to outliers. OLS gives equal weight to all observations, so outliers and leverage points ‘pull’ the regression line. M-estimation (as implied by your code) gives a lower weight to these observations, so the fitted line is pulled less toward them, resulting in a lower R-squared. The estimated coefficients should differ as well.
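
To see the down-weighting directly, one option is to ask ROBUSTREG for its final weights and fit the same model with PROC REG, then look at which observations ended up with the smallest weights. This is only a minimal sketch, assuming the data set and variable names from your post. The output data set names (rob_out, ols_out) and the new variable names (rob_weight, is_outlier, and so on) are made up for illustration.

```sas
/* Robust fit. M-estimation is the default; method=m is stated only for clarity. */
proc robustreg data=sample_data method=m;
   model Y = x1 x2 / diagnostics leverage;
   /* weight=   : final weight given to each observation            */
   /* outlier=  : indicator for observations flagged as outliers    */
   /* leverage= : indicator for observations flagged as leverage    */
   output out=rob_out r=rob_resid sr=rob_stdres
          weight=rob_weight outlier=is_outlier leverage=is_leverage;
run;

/* OLS fit of the same model: every observation gets the same weight. */
proc reg data=sample_data;
   model Y = x1 x2;
   output out=ols_out r=ols_resid student=ols_student;
run;
quit;

/* List the observations the robust fit down-weighted the most. */
/* The cutoff of 10 rows is arbitrary.                          */
proc sort data=rob_out;
   by rob_weight;
run;

proc print data=rob_out(obs=10);
   var Y x1 x2 rob_resid rob_stdres rob_weight is_outlier is_leverage;
run;
```

On the “SCALE” question: as far as I know, the Scale row in the ROBUSTREG parameter table is the robust estimate of the residual scale (the spread of the errors), not an additional regressor, so it has no slope interpretation. The R-square that ROBUSTREG prints is also a robust version based on the loss function rather than the OLS formula, so I would not expect the two R-squared values to match or to be directly comparable.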

best regards,

Joost

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas