IBES summary data problems
Posted: 25 September 2011 09:59 AM   [ Ignore ]
Sr. Member
RankRankRankRank
Total Posts:  169
Joined  2011-09-20

Joost,

I found the following reasons why people calculate their own mean and standard deviation of forecast from a paper,
instead of using the summary data of IBES to get mean and std directly.

————————————
As Diether, Malloy and Scherbina (2002) and Zhang (2006), among others,
point out, there are two problems with the I/B/E/S standard-issue summary
data set.

First, the inclusion of stale (thus less meaningful) forecasts tends to
increase dispersion.

Second and more importantly, a rounding error problem
occurs when I/B/E/S adjusts all data for stock splits but only rounds the
estimate to the nearest cent. When applying the adjustment factor back,
the dispersion of forecasts is falsely altered.

I deal with these issues by using only quite recent forecasts and calculating the mean and standard deviation
of forecasts in my own sample, instead of directly using the aggregate data
as provided in the summary data set of I/B/E/S

—————————————
above information are quoted from section 4 of the follwoing paper:
A new measure of analyst eanrings forecast dispersion
Wenjun Xie
this paper could be download from SSRN.com

 Signature 

Zenghui
A humble student of business

Profile
 
 
Posted: 25 September 2011 11:34 AM   [ Ignore ]   [ # 1 ]
Sr. Member
RankRankRankRank
Total Posts:  169
Joined  2011-09-20

The idea original is from: Page 2118 of

Diether, K., Malloy, C., Scherbina A., 2002, Differences of opinion and the cross section of
stock returns, The Journal of Finance 57, 2113-2141.

 Signature 

Zenghui
A humble student of business

Profile
 
 
Posted: 25 September 2011 09:50 PM   [ Ignore ]   [ # 2 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi Zenghui,

Since a few years IBES has the ‘unadjusted’ tables (i.e., no adjustments for stock splits etc). Stale forecasts would still be an issue.

It looks like “DETU_EPSUS” (I/B/E/S Detail History - Unadjusted (EPS for US Region)) would be a good starting point to build the set without stale forecasts.
Or possibly the ‘normalized’ equivalent “NDETU_EPSUS” (Normalized I/B/E/S Detail History - Unadjusted (EPS for US Region)) (IBES manual says ‘normalized’ datasets has forecasts on firms where the firms keep the same currency over time, i.e. no currency switches).

Do you want to give it a shot?

best regards,

Joost


 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 26 September 2011 11:16 AM   [ Ignore ]   [ # 3 ]
Sr. Member
RankRankRankRank
Total Posts:  169
Joined  2011-09-20

Joost,

Yes, i am going to give it a try.

here is what i think about the logic of the program (calculate the analyst coverage, analyst consensus forecast, analyst forecast error and analyst forecast dispersion).

1. select ibes data between the time range interested. (for a example, one month before the actual EPS annouced).
2. sort by the analyst code, and select only the last forecast. drop the duplicate (early forecasts from the same analyst.
3. count the observation covering the same firm, the total will be the N, analyst coverage.
4. calculate the mean of the forecast, this will be consensus forecast.
5. calculate the different between the mean and the actual annouced EPS, this will be the error of the forecast. (maybe i should take the abs value here).
6. if N>2, calculate the STD of the data, this will be the analyst forecast dispersion.

One quick question here, since we will use Unadjusted data, should we do adjustment ourselves? ie. if the stock price is 1.40 dollars, and the split factoris 2, should i use 0.7 dollar here?

Let me know if you agree about my logic above.

Thanks,

 Signature 

Zenghui
A humble student of business

Profile
 
 
Posted: 26 September 2011 12:40 PM   [ Ignore ]   [ # 4 ]
Sr. Member
RankRankRankRank
Total Posts:  169
Joined  2011-09-20

*This code is used to replace part of ECR example SAS code to calculate the consensus forecast;

*here is only the first part, download the details data.

rsubmit;
libname ibes ‘/wrds/ibes/sasdata’;
proc upload data=myLib3.g_withibes out=getIbes;

PROC SQL;
create table ibesdata as
select a.*, b.REVDATS, b.VALUE, b.CNAME, b.ACTUAL /*actual = actual eps, value = estimated eps, revdats = last date estimate valide */
from getIbes a, ibes.DET_EPSUS b
where
a.ibes_ticker = b.ticker
and b.MEASURE=“EPS”
/*and b.FISCALP=“QTR”*/ /*we don’t have periodicity = “QTR” option in DET_EPSUS*/
and b.FPI = “6”
and a.rdq - 90 < b.REVDATS < a.rdq +1 /*review date within two month of annoucement of actual EPS*/
and a.datadate -5 <= b.FPEDATS <= a.datadate +5
;
quit;

proc download data=ibesdata out=myLib3.h_ibesEstimate;
run;

endrsubmit;

Joost,

I have two questions so far:
1.  i am using review date as the control, if the estimate review date is less than EPS annouce date - 60 days, i download this data.
2.  we don’t have the control of periodicity = “QTR” option in DET_EPSUS, will this be a problem?

Will put more code on here once i got them running without errors.

Cheers,
Zenghui

 Signature 

Zenghui
A humble student of business

Profile
 
 
Posted: 26 September 2011 03:38 PM   [ Ignore ]   [ # 5 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi Zenghui,

I have not used the detail records before. I ran this code to get a feel for this dataset:

rsubmit;
libname ibes ‘/wrds/ibes/sasdata’;

PROC SQL;
create table ibesdata as
select a.*
from ibes.DET_EPSUS a
where
a.oftic = “GOOG”
;
quit;

proc download data=ibesdata out=myLib3.z_google;
run;

endrsubmit;


The review date seems to be the date to use (this variable is also used when it is not a revision, but a first annoucement).
I am not sure about requiring FPI to equal 6. I see some observations where the revision is posted one day before the earnings announcement, where FPI equals 8.
It looks like requiring FPI to be >= 6 will work though. This will drop annual forecasts, and quarterly forecasts remain. Requiring a date range for REVDATS will get rid of the stale forecasts.

Instead of “a.rdq - 90 < b.REVDATS < a.rdq +1” I think I would do “a.rdq - 90 < b.REVDATS < a.rdq - 1”. (I am not sure if this happens, but if there are revisions after the announcement, it is probably better not to include them.)

One thing that the code seems to be missing is only taking the last expectation for each analyst. Currently, when an analyst revises in the window, it will end up multiple times in this set.

Something like this should work (I only ran it on the sample Google set):
- first sort the observations so that oldest revision date comes first ;
proc sort data = myLib3.z_google ; by cusip FPEDATS ANALYS descending REVDATS ;
run;

- then, use nodupkey to get rid of older revisions/announcements:
* this code will drop duplicate ‘keys’ (only first REVDATS will be kept);
proc sort data = myLib3.z_google nodupkey out = myLib3.z_google2 dupout=myLib3.z_google_dropped;
by cusip FPEDATS ANALYS ;
run;

The ‘key’ in the nodupkey sort is “cusip FPEDAT ANALYS”, meaning, we only include a single revision for each analyst-cusip-quarter. For inspection purposes dupout includes all obs that are dropped.

One way to test the code is to use STATPERS before the earnings announcement (RDQ), and set the window to a very long period (a year or longer). In that case, the results would need to match those that are on the summary file (STATSUMU_EPSUS).

When using stock price as the scalar, use the historic stock price (unadjusted for stock splits).

hope this helps,

Joost

 

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 26 September 2011 09:50 PM   [ Ignore ]   [ # 6 ]
Sr. Member
RankRankRankRank
Total Posts:  169
Joined  2011-09-20

Joost,

I really appreciate your kind comments.

I didn’t fully catch your idea to test with sumary ibes file.  Do you mean by setting the window long, we can still drop the older analyst forecast, and do the same calculation? I guess i didn’t get why using the long period will benifit in our test like this. could you please explain a little more on this point?

Many Many thanks,

 Signature 

Zenghui
A humble student of business

Profile
 
 
Posted: 26 September 2011 09:52 PM   [ Ignore ]   [ # 7 ]
Sr. Member
RankRankRankRank
Total Posts:  169
Joined  2011-09-20

Joost,

To use the unadjusted stock price, the benifit is easy to compare with the historical price data. right?

if we only have the adjusted stock price, we will have to adjusted back by # of new stock issued, then compare with the historical price data. Am i correct here?

thank you once more,

 Signature 

Zenghui
A humble student of business

Profile
 
 
Posted: 27 September 2011 09:57 AM   [ Ignore ]   [ # 8 ]
Sr. Member
RankRankRankRank
Total Posts:  169
Joined  2011-09-20

Joost,

a. FPI >= 6

gives an error as the following:
ERROR: Expression using greater than or equal (>=) has components that are of
    different data types.

I think there are values like N, O, P… etc in this variable, and the values I need is 6, 7, 8, 9.

I tried a. FPI =  “6” or “7”,
and
a.FPI = “6,7”
a.FPI = “6 or 7”
it doesn’t work.

Any suggest about how to format this?

Thanks,

 Signature 

Zenghui
A humble student of business

Profile
 
 
Posted: 27 September 2011 10:15 AM   [ Ignore ]   [ # 9 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi Zenghui,

Sorry, FPI is a string (character), not numeric.

where FPI IN (“6”, “7”, “8”, “9”)

assuming there is no need to go beyond “9”.

About adjusted stock price versus ‘historic’ stock price (unadjusted). If papers scale by stock price when computing their measures (unexpected earnings, dispersion, etc), it makes sense to use the ‘historic’ (unadjusted) stock price when using the unadjusted forecasts.
In the ERC example, I am using prccq, which is the (unadjusted) stock price at the end of the quarter.

If you want to use the stock price at a different date (for example, the stock price 2 days before the earnings announcement), you can retrieve it from crsp.dsf. This table holds a field for stock splits and other adjustments.

Edit: I have posted some code in a reply below to compare end of year stock prices in Compustat, and stock prices in crsp.DSF, as well as split factors.
As an additional benchmark, I use finance.yahoo.com.

Zenghui - 26 September 2011 09:50 PM

I didn’t fully catch your idea to test with sumary ibes file.  Do you mean by setting the window long, we can still drop the older analyst forecast, and do the same calculation? I guess i didn’t get why using the long period will benifit in our test like this. could you please explain a little more on this point?

Many Many thanks,

I meant you could test if the SAScode by running it twice:
- first, have RDQ as the revision date, and set the window to 3 months; this will reduce staleness
- second, use STATPERS as the revision date, and the window to 12 months; the forecast should match the forecast on the summary files. If it doesn’t match, there is a mistake, because STATSUMU forecasts are constructed using DET_EPSUS


kind regards,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 27 September 2011 10:35 AM   [ Ignore ]   [ # 10 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi Zenghui,

Query to get NVDA stock price, at the end of each year (fiscal year end is January). prcc_c is stock price end of calendar year (unadjusted), prcc_f is end of fiscal year (end of January) stock price (unadjusted), csho (common shares outstanding), adjex_f (adjustment factor, cumulative).

rsubmit;
 
libname comp ‘/wrds/comp/sasdata/naa’;
           
PROC SQL;
  create table rdq(keep = gvkey fyr datadate prcc_c prcc_f csho adjex_f) as
  select a.* 
  from comp.funda a
  where tic eq “NVDA”
      and indfmt=‘INDL’ and datafmt=‘STD’ and popsrc=‘D’ and consol=‘C’ ;
  quit;
 
proc download data=rdq out=myLib3.z_nvda_comp;run;
 
endrsubmit;
 
Compare with the monthly stock prices (finance.yahoo.com, select NVDA, select historic stock prices, range jan 1 2000 - jan 1 2010, daily, below table is link to csv):
http://ichart.finance.yahoo.com/table.csv?s=NVDA&a=00&b=01&c=2000&d=00&e=01&f=2010&g=d&ignore;=.csv

And, compare with crsp.dsf:

rsubmit;
   
libname crsp ‘/wrds/crsp/sasdata/sd’ ;
 
PROC SQL;
  create table nvdaPrc as
  select *
  from crsp.dsf  
  where
      cusip = “67066G10” ; *NVDA;
  quit;
 
proc download data=nvdaPrc out=myLib3.z_NVDA_ret; run;

endrsubmit;
 
Stock price (PRC) in crsp.dsf matches with prcc_c and prcc_f in Compustat.
Cumulative Factor to Adjust Prices (CFACPR) matches with Cumulative Adjustment Factor by Ex-Date - Fiscal (adjex_f) in Compustat.

best regards,

Joost

 

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 27 September 2011 11:56 AM   [ Ignore ]   [ # 11 ]
Sr. Member
RankRankRankRank
Total Posts:  169
Joined  2011-09-20

Joost,

I put the your code and my code together for the analyst forecast here.
In case I need it in the future and to share it with other people.

Thanks for the code for verify the stock price.

Zenghui

——————————-

*this code compute 3 company’s analyst forecast characteristics;
*N, coverage;
*forecast error by each analyst;
*average forecast error by all analyst;
*forecast dispersion;


*First clear the log and output for this program;*************************;
dm “out;clear;log;clear;”;
*setup the output format for easier reading;
options ls = 78 ps = 66;
**************************************************************************;


******************tell sas where to put data files;***********************;
libname myLib6 “C:\Users\Zenghui\Dropbox\01 accounting research\01 SAS\sas learning\forecast error”;
********************connect to WRDs;***************************************;
%let wrds = wrds.wharton.upenn.edu 4016;options comamid = TCP remote=WRDS;
signon username=_prompt_;
*************************************************************************;

rsubmit;
libname ibes ‘/wrds/ibes/sasdata’;

PROC SQL;
create table ibesdata as
select a.*
from ibes.DET_EPSUS a
where
a.oftic in (“GOOG”,“AAPL”,“MSFT”)
and a.MEASURE= “EPS”
and a.FPI in (“6”,“7”,“8”,“9”)  /*Control FPI to select only the quarter forecasts*/

/* the following 3 lines mabye useful in event study like ERC example on wrds.us */
/*and a.rdq - 90 < b.REVDATS < a.rdq -1*/
/*review date within two month of annoucement of actual EPS*/
/*and a.datadate -5 <= b.FPEDATS <= a.datadate +5*/ 
;
quit;

/*download data to local computer*/
proc download data=ibesdata out=myLib6.z_google;
run;

endrsubmit;


proc sort data = myLib6.z_google ; by cusip FPEDATS ANALYS descending REVDATS ;
run;
/*only keep the last forecast of each analyst*/
proc sort data = myLib6.z_google nodupkey out = myLib6.z_google2 dupout=myLib6.z_google_dropped;
by cusip FPEDATS ANALYS ;
run;

data myLib6.z_google3;
set myLib6.z_google2;
if ACTUAL ne .; /*drop missing actual value obs*/
forecast_error = ACTUAL - value; /*Calculate forecast error by each analyst*/
run;


proc sort
data = myLib6.z_google3
out = myLib6.z_google4;
by cusip FPEDATS;
run;
proc means data = myLib6.z_google4 mean median maxdec=5 std N noprint;
OUTPUT OUT = myLib6.z_google5 N(value)= coverage mean(value)= avgforecast std(value)= dispersionforecast median(value)= medianofforecast mean(forecast_error) = aveforecasterror ;
var value forecast_error; /*calculate average forecast error for a perticular firm and perticular quarter*/
by cusip FPEDATS;
run;

proc print
data=myLib6.z_google5 (obs=10);
run;

 Signature 

Zenghui
A humble student of business

Profile