In this example an event study is performed. You will need a login for WRDS in order to retrieve the required data from Compustat.


As a result of mistakes/fraud and changes in accounting principles, firms may need to change prior financial statements. Normally (but not always) these restatements lead to a lower profitability than previously reported. Hence, it is to be expected that on average the announcement of such a restatement leads to a lower stock price. The event study methodology is the appropriate tool to investigate the relation between a restatement announcement and a stock price reaction. In this tutorial it is assumed that you are familiar with this method. The purpose of this tutorial is to show how the event study can be performed 'mannually' with WRDS and SAS (as opposed to using a package like Eventus).

The GAO dataset

The U.S. Government Accountability Office (GAO) has released several studies investigating restatements. These studies include information about the firms, the date of the press release and the reasons for the restatement. (For the restatements during January 1997 - September 2005, the reasons are included. The set from October 2005 - June 2006 does not include the reason.)

Here you may obtain the underlying documents:
January, 17th, 2003: pdf: Financial Statement Restatement Database
August, 31st, 2006: pdf: Financial Restatement Database
Data files in csv format (not including 1997-2002)


The GAO dataset has been converted into .txt (text, tab delimited) format, with indicator variables for the 12 different reasons as identified by GAO. The SAS code (below) will import this dataset, collect the neccessary data from WRDS, manage the data and perform regression analysis. Download the dataset (right-click and select "save target as").

The reasons identified by GAO and the corresponsing indicator name:

reason1: Acquisitions and mergers
reason2: Cost or expense
reason3: IPR&D
reason4: Loan-loss
reason5: Other
reason6: Reclassification
reason7: Related-party transactions
reason8: Restructuring, assets, or inventory
reason9: Revenue recognition
reason10: Securities related
reason11: Tax related
reason12: Unspecified

Note that the reasons are not mutually exclusive. A single restatement may have several reasons.

SAS code

The SAS code illustrates the following:
- the use of multiple local libraries
- importing a text file into SAS
- looking up PERMNO (firm identifier used by CRSP) based on a ticker symbol and the date of the press release
- creating a unique key for each restament (some firms have multiple restatements)
- deleting observations as a result of multiple classes of shares for a given company (e.g. class A and class B shares)
- using the 'spedis' function to measure the level of similarity in the CRSP company name and the GAO company name
- using the 'INTNX' function to generate a start/end date for collecting 30 months of returns
- verifying the availability of data (in this case: the number of monthly returns available for each observation)
- performing 'bulk' regression (time series estimation of beta for each restatement)
- computing cumulative abnormal return (CAR) for each restatement
- excluding observations where the announcement date was on a non-trading day (weekend or holiday)
- managing outliers: winsorizing data
- regression analysis

Copy/paste the SAS code (see below) into the editor, or download it (right-click and select "save target as").


Normally, additional steps are taken to make sure there are no confounding events. The appropriate procedure would include:
- matching on Compustat [/wrds/comp/sasdata/naa, dataset 'fundq', variable rdq: Report Date of Quarterly Earnings], exclude observations with an earnings announcement around the event date
- matching on Compustat Filings [/wrds/comp/sasdata/filings, dataset 'filings'], and exclude observations with a SEC filing around the event date (such as 8-K, 10Q/K, etc)
- matching on First Call Company Issued Guidance [/wrds/firstcall/sasdata, dataset 'cig'] and exclude firms that issue guidance around the event date
- manually verifying there are no major news events around the event date with Factiva or Lexis-Nexis

In this example the market index return is used for:
- monthly returns: calculating the beta for each observation
- daily returns: calculating the abnormal return for the 3 days surrounding the press release
To also use the market indices in other studies, a seperate library is constructed 
to hold the index datasets. This allows appending the market return to datasets locally. 
libname myLib2 "D:\_examples\sasdata2";
libname indices "D:\_examples\sasdata_indices";
/* retrieve monthly and daily index */
%let wrds = 4016;options comamid = TCP remote=WRDS;
signon username=_prompt_;
/* Even though "Home -> Support -> Data -> Dataset List" shows "/wrds/crsp/sasdata/ix"
as the location of the library, this location triggers the following error:
ERROR: User does not have appropriate authorization level for library CRSP.
Instead, "/wrds/crspq/sasdata/ix" needs to be used (q implies that the dataset 
is updated quarterly).
Thus, this library assignment may vary across institutions (depending on permissions)
libname crsp '/wrds/crspq/sasdata/ix';
/* msix: (Monthly)NYSE/AMEX/NASDAQ Capitalization Deciles, Annual Rebalanced
/* dsix: (Daily) NYSE/AMEX/NASDAQ Capitalization Deciles, Annual Rebalanced 
variables of interest: 
- vwretd: Value-Weighted Return-incl. dividends
- ewretd: Equal-Weighted Return-incl. dividends
- caldt: Calendar Date */
  create table monthlyIndex(keep = permno caldt ret vwretd ewretd) as
  select a.*
  from crsp.msix a;
  create table dailyIndex(keep = permno caldt ret vwretd ewretd) as
  select a.*
  from crsp.dsix a;
proc download data=monthlyIndex out=indices.monthlyIndex;
proc download data=dailyIndex out=indices.dailyIndex;
/* import the dataset */
PROC IMPORT OUT= myLib2.a_tickers
            DATAFILE= "D:\_examples\example2_restatements.txt"
            DBMS=TAB REPLACE;
ticker symbols change over time, it is preferable to use PERMNO as the firm 
identifier for stock price/return info However, the GAO dataset has the ticker 
symbol as the firm identifier.
How can we get the correct PERMNO if the ticker symbol changes over time?
Solution: we use 'dsenames', which keeps track of changes in ticker symbol, 
firm name, PERMNO, etc NAMEDT holds the beginning date, and NAMEENDT the end 
We therefore match on ticker symbol AND also the press release date needs to 
be between these dates
libname crsp '/wrds/crsp/sasdata/sd';
proc upload data=myLib2.a_tickers out=tickers;run;
  create table withcusip 
	(keep = firmname ticker press_date exchange cusip permno ncusip comnam 
	reason1-reason12 NAMEDT NAMEENDT) as
  select a.* , b.*
  from crsp.dsenames a , tickers b
  where a.ticker = b.ticker 
	and a.NAMEDT <= press_date <= a.NAMEENDT;
proc sort data=withcusip nodup; by ticker press_date ;
proc download data=withcusip out=myLib2.b_withcusip;
/* create an identifier for the observations
since a firm can have multiple restatements, an obvious candidate for such
an identifier is a combination of both
'||' concatenates the press release date (converted to a number) with the ticker 
symbol. hence: key is unique for each observation
data myLib2.b_withcusip;
set myLib2.b_withcusip;
key = trim(press_date || "_")|| trim(ticker);
/* sometimes firms have multiple permno's
if so, then there will be multiple observations with the same key.
in that case, take the first observation */
proc sort data=myLib2.b_withcusip nodup; by key permno;
/* note the 'by' statement, this will group all observations by their key
the 'if first.key then output' will only output the first observation, hence, 
there will be no multiple instances of a single key
data myLib2.b_withcusip;
set myLib2.b_withcusip;
by key;
if first.key then output;run;
As an additional test, verify if the name in the press release (provided by GAO)
is sufficiently comparable to the name on crsp 'spedis' is a function that compares
 to strings, the lower the score, the more the strings are alike observations with 
a score smaller than 30 are retained (this cutoff is arbitrary)
data myLib2.c_verifyThese (keep = permno COMNAM firmname);
set myLib2.b_withcusip;
name_dist = spedis(lowcase(firmname), lowcase(comnam));
if name_dist >= 30;run;
the dataset d_verifyThese contains 142 observations that are manually inspected
of these 144 observations, the names of 6 firms (9 observations) does not match
permno of firms with different company names: 
85654, 81547, 80142, 90173, 49429, 89303 
these observations are hence excluded from the sample
data myLib2.d_verified (drop = NAMEDT NAMEENDT);
set myLib2.b_withcusip;
if permno ne 49429;
if permno ne 80142;
if permno ne 81547;
if permno ne 85654;
if permno ne 89303;
if permno ne 90173;
/* compute a beta for each observation */
/* create a small dataset with the neccessary info to estimate beta
month_1 - month_30 is a 30 month window
collect 30 months of montly returns for the observation
and add the return on the index for each month
an observation's beta is the correlation between the 30 months and the index
data myLib2.e_getBeta (keep = key permno press_date  month_1 month_30);
set myLib2.d_verified; 
format month_1  date.;
format month_30 date.;
libname crsp '/wrds/crsp/sasdata/sm';
proc upload data=myLib2.e_getBeta out=getBeta;run;
  create table returnData
	(keep = key permno press_date month_1 month_30 date ret) as
  select msf.*, b.*
  from crsp.msf a, getBeta b
  where b.month_1 <= <= b.month_30 and a.permno = b.permno;
proc sort data = returnData nodup;by key date;
proc download data=returnData out=myLib2.f_returnData;
data myLib2.f_returnData;
set myLib2.f_returnData;
if RET gt -55;		* missing: -66, -77, -88 etc;
if (1*RET eq RET) ;	* must be numeric;
/* append the return on the index*/
  create table myLib2.g_withIndex
	(keep = key permno press_date date ret vwretd) as
  select a.*, b.*
  from indices.monthlyIndex a, myLib2.f_returnData b
  where a.caldt = ;
/* make sure there are enough monthly returns to estimate beta */
proc sql;
	create table myLib2.h_numMonths as
			distinct key, count(*) as number_of_months
		group by key;			
/* go back to g_withIndex and only select those observations that have s
ufficient number of months; arbitrary cutoff: 12 months*/
proc sql;
	create table myLib2.i_readyToEstimate as
	select a.* 
	from myLib2.g_withIndex a, myLib2.h_numMonths b
	where a.key = b.key
		and b.number_of_months >=12
/* estimate beta's */
PROC REG OUTEST = myLib2.j_beta data=myLib2.i_readyToEstimate;
   ID key;
   MODEL  ret = vwretd / NOPRINT;
   by key;
/* the coefficients have the names of the variables,
hence beta equals vwretd in this dataset
renamed here and only keep the relevant variables: key and beta*/
data myLib2.j_beta2 (keep = key alpha beta);
set myLib2.j_beta;
alpha = intercept;
beta = vwretd;
/* include alpha and beta in myLib2.d_verified */
proc sql;
	create table myLib2.k_verifiedAndBeta as
	select a.*, b.alpha, b.beta
		myLib2.d_verified a, myLib2.j_beta2 b
		a.key = b.key;
/* 3 day stock return [-1, +1]*/
libname crsp '/wrds/crsp/sasdata/sd';
proc upload data=myLib2.k_verifiedAndBeta out=get3Day;run;
data get3Day;
set get3Day;
format prevday nextday date9.;
  create table return3Day(keep = key permno date press_date ret alpha beta) as
  select dsf.*, get3Day.*
  from crsp.dsf a, get3Day b
  where b.prevday <= <= b.nextday and a.permno = b.permno;
proc download data=return3Day out=myLib2.l_return3Day;
data myLib2.l_return3Day;
set myLib2.l_return3Day;
if RET gt -55;		* missing: -66, -77, -88 etc;
if (1*RET eq RET) ;	* must be numeric;
/* append daily return on index */
  create table myLib2.m_withdailyIndex(keep = key permno press_date alpha 
beta date ret vwretd) as
  select a.*, b.*
  from indices.dailyIndex a, myLib2.l_return3Day b
  where a.caldt = ;
proc sort data = myLib2.m_withdailyIndex; by key date;run;
/* compute CAR's */
data myLib2.n_CAR (keep = key r_3day car_vw counter alpha beta  );
set myLib2.m_withdailyIndex;
/* by key means that it will run through these statements for each key,
where key is a unique identifier of one observation
by key;
/* retain means that the contents of these variables will be available
for example, counter will be set to zero with every observation 
with every row of data (return), it will increase by 1
"if last.key then output" will write the value of the counter to the newly
created dataset
r_3day: 1 + 3 days of returns
car_vw: 3-day cumulative abnormal return 
retain counter r_3day  car_vw;
if first.key then counter=0;
if first.key then car_vw=1;
if first.key then r_3day=1;
if ret eq . then ret = 0;
/* cumulative abnormal return */
car_vw = car_vw + ret - (alpha+beta * vwretd);
/* raw return, 3 day cumulative */
r_3day = r_3day + ret;
/* we are only interested in keeping the cumulative 3 day return (and not 
cumulative 1 and 2 day) */
if last.key then output;
/* match CAR with myLib2.k_verifiedAndBeta */
proc sql;
	create table myLib2.o_withCAR as
	select a.*, b.car_vw, b.r_3day, b.counter
		myLib2.k_verifiedAndBeta a, myLib2.n_CAR b
		a.key = b.key;
data myLib2.p_allVariables;
set myLib2.o_withCAR;
car_vw_log = log(car_vw);
r_3day_log = log(r_3day);
/* exclude firms where the event date was no trading day
   for example the press release was issued on a sunday
   or on a holiday */
proc sql;
	create table myLib2.q_finalSet as
	select a.*
		myLib2.p_allVariables a
		a.press_date IN (select caldt from indices.dailyIndex);
Trim or winsorize macro
* byvar = none for no byvar;
* type  = delete/winsor (delete will trim, winsor will winsorize;
*dsetin = dataset to winsorize/trim;
*dsetout = dataset to output with winsorized/trimmed values;
*byvar = subsetting variables to winsorize/trim on;
%macro winsor(dsetin=, dsetout=, byvar=none, vars=, type=winsor, pctl=1 99);
%if &dsetout = %then %let dsetout = &dsetin;
%let varL=;
%let varH=;
%let xn=1;
%do %until ( %scan(&vars,&xn)= );
    %let token = %scan(&vars,&xn);
    %let varL = &varL &token.L;
    %let varH = &varH &token.H;
    %let xn=%EVAL(&xn + 1);
%let xn=%eval(&xn-1);
data xtemp;
    set &dsetin;
%if &byvar = none %then %do;
    data xtemp;
        set xtemp;
        xbyvar = 1;
    %let byvar = xbyvar;
proc sort data = xtemp;
    by &byvar;
proc univariate data = xtemp noprint;
    by &byvar;
    var &vars;
    output out = xtemp_pctl PCTLPTS = &pctl PCTLPRE = &vars PCTLNAME = L H;
data &dsetout;
    merge xtemp xtemp_pctl;
    by &byvar;
    array trimvars{&xn} &vars;
    array trimvarl{&xn} &varL;
    array trimvarh{&xn} &varH;
    do xi = 1 to dim(trimvars);
        %if &type = winsor %then %do;
            if not missing(trimvars{xi}) then do;
              if (trimvars{xi} < trimvarl{xi}) then trimvars{xi} = trimvarl{xi};
              if (trimvars{xi} > trimvarh{xi}) then trimvars{xi} = trimvarh{xi};
        %else %do;
            if not missing(trimvars{xi}) then do;
              if (trimvars{xi} < trimvarl{xi}) then delete;
              if (trimvars{xi} > trimvarh{xi}) then delete;
    drop &varL &varH xbyvar xi;
%mend winsor;
/* invoke macro to winsorize */
%winsor(dsetin=myLib2.q_finalSet, dsetout=myLib2.r_finalWinsorized, byvar=none, 
vars=car_vw_log r_3day_log , type=winsor, pctl=1 99);
/* run regressions by exchange */
proc sort data = myLib2.r_finalWinsorized; by exchange;run;
   MODEL  car_vw_log = reason1-reason12 ;
     by exchange;

All rights reserved. © 2010-2014 [Copyright] [Privacy Statement] [Disclaimer] [About]