about ccmxpf_linktable?
Posted: 21 June 2012 11:04 AM   [ Ignore ]
Newbie
Rank
Total Posts:  11
Joined  2012-06-21

Is anyone aware of any problem with ccmxpf_linktable?

I used it to merge CRSP/COMPUSTAT for replicating a paper. I got a lower number of firms than that reported in the paper.
The reason I guess is that there are much fewer links back to the early period of the sample (around 1950), isn’t it?

If it is, does anyone know how to solve this?

Thanks,

Profile
 
 
Posted: 21 June 2012 12:30 PM   [ Ignore ]   [ # 1 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi Nan,

I am not aware of the issue. If the paper at hand has a table with numobs, then you might compare it with this (year of start of link, not numobs by year):

linkdtyr   numobs
  1946       1
  1949     140
  1950     534
  1951     125
  1952     17
  1953     17
  1954     14
  1955     23
  1956     25
  1957     41
  1958     99
  1959     269

Which is the output of:
proc sql;
select distinct year(linkdt) as linkdtyr, count(*) as numobs from crsp.ccmxpf_linktable where year(linkdt) < 1960 group by year(linkdt) ;
quit;

best regards,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 26 June 2012 11:14 AM   [ Ignore ]   [ # 2 ]
Newbie
Rank
Total Posts:  11
Joined  2012-06-21

Hi Joost,

Thanks a lot for your reply.

That paper reported that they can have over 150 firms that have been listed for at least 50 years since 1950 with non-missing total assets and share pirce.

I used the ccmxpf_linktable to merge CRSP and Compustat and required the fiscal-year end to fall within the link range. I ended up with
106 firms meeting their screenings.

So far I haven’t figured out the reason why I came short. Appreciated if you have any thought on this.

Profile
 
 
Posted: 26 June 2012 02:31 PM   [ Ignore ]   [ # 3 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi Nan,

Would you mind uploading your sas-code in a text file to your previous post?

best regards,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 26 June 2012 03:10 PM   [ Ignore ]   [ # 4 ]
Newbie
Rank
Total Posts:  11
Joined  2012-06-21

Hi Joost,

Thanks for your reply. I just attached my sas code to my previous post.

The program generates a dataset called mydata.sas7bdat, which is the CRSP/Compustat merged file from the original dataset comp.sas7bdat. comp.sas7bdat contains
all firms that shows up in Compustat in 1950 and have non-missing total assets up to at least 2000.

If you count the number of firms satisfying the requirements of starting from 1950 with nonmissing total assets and share price in mydata, it reports 106 if I
got it right.

Let me know if it is my code problem or other possible reasons. Thanks!

Best,
Nan

Profile
 
 
Posted: 26 June 2012 03:39 PM   [ Ignore ]   [ # 5 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi Nan,

I took another angle; I started with ccm link file.

Does Funda already have 2011 on file? If not, it doesn’t look good.

best,

Joost

/* matching.ccm is the ccm link file, but with linkenddt replaced with future date when ‘missing’ (still active) */

data z;
set matching.ccm;

/* firms with at least 50 year activity—333 firms selected, but this is
oversampled, because I have used 1/1/2020 as linkenddt when still active */
years = year(linkenddt) - year(linkdt);
if years >=50;
run;

/* Compustat Funda */

proc sql;

create table z2 as select
  a.*, b.fyear, b.at, b.prcc_f
from
  z a,
  comp.funda b
where
  a.gvkey = b.gvkey
and b.at ne .
and b.prcc_f ne .
and b.indfmt=‘INDL’ and b.datafmt=‘STD’ and b.popsrc=‘D’ and b.consol=‘C’ ;
;

quit;

/* Drop doubles */

proc sort data=z2 nodupkey; by gvkey fyear;run;

/* Count number of years */

proc sql;

create table z3 as select distinct gvkey, count(*) as numObs from z2 group by gvkey;

quit;

/* 163 obs if 49 as cutoff (I have Funda updated up to 2010, I believe 2011 is available)
only 28 obs using 50 as cutoff
*/

data z4;
set z3;
if numObs >=49;run;


data z5;
set z3;
if numObs >=50;run;

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 26 June 2012 03:55 PM   [ Ignore ]   [ # 6 ]
Newbie
Rank
Total Posts:  11
Joined  2012-06-21

Hi Joost,

I assume you used collapsed version of ccm link file (such as the one generated by ccm_lnktable.sas from WRDS sample codes), right?

If the requirement “year(datadate)=1950” and “fic=‘USA’” are imposed, the sample size will probably shrink.


Nan

Profile
 
 
Posted: 26 June 2012 04:34 PM   [ Ignore ]   [ # 7 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi Nan,

Look at this:

data z;
set matching.ccm;
if year(linkdt) <=1950;
run;

Pretty much all these matches ‘end’ at 1962. Then, a lot of these are ‘revived’ in 1962. So, 1962 seems to be a year where these links were rebuild.

For example:

/*  link is split over 3 periods: 1950 - 1962, 1962-1993, 1993- current */

data z2;
set matching.ccm;
if lpermco eq 21800;
run;

Did you take this into account? (Sorry - did not verify).

best,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 26 June 2012 05:57 PM   [ Ignore ]   [ # 8 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi Nan,

This is what I have now:

best,

Joost


/* Gvkey starting on/before 1950 => 385 obs*/

proc sql;

create table z as select distinct gvkey, linkdt from matching.ccm where year(linkdt) <= 1950;
quit;

/* At least 50 years of coverage => 185 obs*/

proc sql;

create table z2 as select distinct a.*, b.* from z a, matching.ccm b where a.gvkey = b.gvkey and year(b.linkenddt) >= year (a.linkdt) + 50;
quit;

/* Starting year, ending year => 155 obs */

proc sql; create table z3 as select distinct gvkey, min(linkdt) as linkdt, max(linkenddt) as linkenddt from z2 group by gvkey; quit;

/* Format dates => 155 obs */

data z3;
set z3;
format linkdt linkenddt date9.;run;

/* US firms only => 149 obs*/

proc sql; create table z4 as select a.*, b.fic from z3 a, company.company b where a.gvkey = b.gvkey and b.fic = “USA”; quit;

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile