I recently used “PROC ROBUSTREG” to identify outliers and leverage points. I used the following code, and it worked very well:
proc robustreg data=sample_data;
   model Y = x1 x2 / diagnostics leverage;
   output out=regress_data_reg3 r=resid sr=stdres;
run;
Since I was not very familiar with PROC ROBUSTREG, I also ran PROC REG on the same data, to be safe and to compare.
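A PROC REG call comparable to the ROBUSTREG step might look like the following sketch. It is not the exact code from the comparison run: the output data set name regress_data_ols and the variable names resid_ols and stdres_ols are placeholders, and the R and INFLUENCE options are assumed here as the ordinary least squares counterparts of the diagnostics requested above.

```sas
/* Sketch only: OLS fit on the same data for comparison.       */
/* regress_data_ols, resid_ols and stdres_ols are placeholder  */
/* names, not taken from the original run.                     */
proc reg data=sample_data;
   /* R prints residual analysis; INFLUENCE prints leverage    */
   /* and influence statistics for each observation            */
   model Y = x1 x2 / r influence;
   /* R= saves raw residuals, STUDENT= saves studentized ones  */
   output out=regress_data_ols r=resid_ols student=stdres_ols;
run;
quit;
```

Because PROC REG is interactive, the QUIT statement ends the procedure after the RUN group.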
The output from the two procedures differs, but contains many of the same items. I have two questions:
1. The two procedures report different R-squared values even though they use the same data. Why?
2. The difference in R-squared noted above most likely comes from the fact that PROC ROBUSTREG adds a variable titled “SCALE”. This adjusts the intercept coefficient, but the coefficients of my variables did not change. Has anyone seen this? Is the new “SCALE” variable the reason the R-squared dropped? How should I interpret the coefficient of the “SCALE” variable?
Thanks,