Merging compustat
Posted: 21 August 2015 11:04 AM   [ Ignore ]
Newbie
Rank
Total Posts:  12
Joined  2015-08-04

Hi, I have a database that I would like to merge with compustat. The database key identifiers are tic, cusip, and fyear plus the data variables. I would like to merge this data with compustat.
First, I would like to merge the data with compustat based on the cusip (cnum)  and fyear. Then, I would like to take the unmatched data and merge it with compustat based on
tic and fyear. Then, combine the two into one dataset. Would this be the correct code? All data is sorted appropriately before running the merge. Or if there is a simpler way to merger based on one identifier and then another identifier that would be great. Thanks in advance.

****merge compustat and kld****;
proc sort data=perm.fkld9109edited out=test;
by cyear cnum;
run;

proc sort data=compcomplete out=comp1asortcnum;
by cyear cnum;
run;

data mergercnum;
merge test (in=a) comp1asortcnum (in=b);
by cyear cnum;
if a and b;
run;


proc sort data=perm.fkld9109edited out=test1;
by cyear tic;
run;

proc sort data=compcomplete out=comp1asortcnum2;
by cyear tic;
run;

data mergertic;
merge test1 (in=a) comp1asortcnum2 (in=b);
by cyear tic;
if a and b;
run;

data all;
set mergercnum mergertic;
run;

proc sort data=all out=zfinal nodupkeys;
by cyear gvkey;
run;

Profile
 
 
Posted: 21 August 2015 03:04 PM   [ Ignore ]   [ # 1 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi Jwhi121,

Maybe I am overlooking it, but I can’t find the code that you leave out the matched obs with cusip (if you have a match based on cusip, I wouldn’t match on ticker/fyear for that observation).

Other than that: are you positive both kld and Compustat use the same ‘version’ of Cusip? Compustat uses the header (current) Cusip - I don’t know about kld; but that would be something to verify (if kld uses historic cusip, firms that changed their cusip over time will not match).

Other than that, it looks all fine to me.

Best,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 21 August 2015 03:10 PM   [ Ignore ]   [ # 2 ]
Newbie
Rank
Total Posts:  12
Joined  2015-08-04

Hi,
I will have to verify about the kld and compusat information.

***Maybe I am overlooking it, but I can’t find the code that you leave out the matched obs with cusip (if you have a match based on cusip, I wouldn’t match on ticker/fyear for that observation).****

For the above comment, would a sort on gvkey and fyear with nodupkeys be okay?

thanks again for the help. i appreciate it. Trying to get the hang of this sas programming is challenging!
-jc

Profile
 
 
Posted: 21 August 2015 03:18 PM   [ Ignore ]   [ # 3 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

Yes, I think you are good—data all; set mergercnum mergertic; run; will have the matches on cusip first, and matches on ticker/fyear after that. So if the same gvkey fyear is matched using both, it will keep the first match (which uses cnum). It is not very explicit though (may be confusing in the long run).

Best,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile