Matching compustat and edgar filings
Posted: 01 November 2012 02:58 PM   [ Ignore ]
Newbie
Rank
Total Posts:  23
Joined  2012-10-24

Dear Joost,

I am trying to match compustat and edgar filings using cik_Gvkey linktable on compustat (compnames_cik).
The problem is cik provided by edgar filings in this forum have 5 or 6 digits, but cik on linktable have 10 digits.
Have you encounter this kind of problem, Joost?
Thank you.


pwyw000

Profile
 
 
Posted: 01 November 2012 03:28 PM   [ Ignore ]   [ # 1 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi pwyw000,

I see - could you point which file on this forum has 5 or 6 digits? It really should be 8-10 or so.

thanks,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 01 November 2012 03:31 PM   [ Ignore ]   [ # 2 ]
Newbie
Rank
Total Posts:  23
Joined  2012-10-24

http://www.wrds.us/index.php/repository/view/25
I got the file from above link.
Some cik even have 4 digit. So I am confused.
Thanks.

pwyw000

Profile
 
 
Posted: 01 November 2012 06:18 PM   [ Ignore ]   [ # 3 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi pwyw000,

Ok, the dataset in that url in sorted by CIK, so the first firms have very ‘low’ CIK numbers.

In Funda, CIK is a ‘character’ field, for example, a CIK of ‘48465’ is represented as ‘0000048465’. I believe the matching will work when you convert the character CIK in Funda to a numeric field (for example with numCIK = 1* CIK).

By the way, CIK in Funda is the current CIK, while the SEC filings use the historic CIK. This means that you won’t get a match for firm-years for that changed their CIK (the years prior to the CIK change won’t match).

best regards,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 01 November 2012 08:28 PM   [ Ignore ]   [ # 4 ]
Newbie
Rank
Total Posts:  23
Joined  2012-10-24

Got you, Joost. Given the issue of historic cik, is there any way to extract def 14a filings that match my sample companies?
Thank you.

pwyw000

Profile
 
 
Posted: 02 November 2012 07:35 AM   [ Ignore ]   [ # 5 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

sure, the dataset has a field ‘formtype’; restrict it to the form-types you are looking for (a datastep with “if” or “where”).

best regards,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 02 November 2012 10:15 AM   [ Ignore ]   [ # 6 ]
Newbie
Rank
Total Posts:  23
Joined  2012-10-24

I guess I will have to hand match the years when cik changed over time.
But it is good to know that issue with sec cik.
Thanks a lot.

pwyw000

Profile
 
 
Posted: 02 November 2012 11:28 AM   [ Ignore ]   [ # 7 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi pwyw000,

Matching by hand is going to be time-consuming. I hope (for your sanity) that it is a small sample. smile

best regards,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 02 November 2012 12:20 PM   [ Ignore ]   [ # 8 ]
Newbie
Rank
Total Posts:  23
Joined  2012-10-24

Hi Joost,

Say I want all the 10-k’s for my sample companies. I can restrict formtype to 10K and I can do that. But meanwhile I have to find 10Ks for a particular company at a particular year.
That imposes two criteria for matching. Could you show me how sql can do this?
Also, since you have mentioned “This means that you won’t get a match for firm-years for that changed their CIK (the years prior to the CIK change won’t match.”, there is nothing I can do to avoid the hand match for part of my sample, right? I was somewhat confused there.

pwyw000

Profile
 
 
Posted: 02 November 2012 02:09 PM   [ Ignore ]   [ # 9 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi pwyw000,

Assuming you have local access to Compustat Funda (for example, you downloaded a sample set of firms). Assuming you have a dataset with: gvkey, fyear, datadate, cik called work.a_comp.
Assuming you have unzipped the dataset with Filing details into work.b_edgar, then matching would go like this.

proc sql;
create table work.c_match as
select a.*, b.*
from a_comp a, b_edgar b
where a.cik = b.cik
and b.date <= a.datadate <= b.date + 120;
quit;

So, matching on firm identifier is done through cik, and matching on date is done by requiring the filingdate to be within 120 days after end of year (the 120 days is arbitrary of course).

On the firms that change CIK. Let’s suppose firm a has a CIK of 1000 in 2009, 2010 and changed to 1020 in 2011 and 2012. Funda will hold the current CIK (‘1020’) for all the available years. In other words, when CIK changes for a firm, all of the records in Funda are updated and show the new CIK. The old CIK will no longer be in Funda. So, matching on the new CIK ‘1020’ will return only filings for the years 2011 and 2012, and not the earlier years.

Hope this helps,

Joost
 

 

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 12 July 2015 09:40 PM   [ Ignore ]   [ # 10 ]
Newbie
Rank
Total Posts:  2
Joined  2015-06-29

hi Joost, regarding what you said “matching on the new CIK ‘1020’ will return only filings for the years 2011 and 2012, and not the earlier years.”, is there any sources that I can find out all the old CIK’s?

Joost Impink - 02 November 2012 02:09 PM

hi pwyw000,

Assuming you have local access to Compustat Funda (for example, you downloaded a sample set of firms). Assuming you have a dataset with: gvkey, fyear, datadate, cik called work.a_comp.
Assuming you have unzipped the dataset with Filing details into work.b_edgar, then matching would go like this.

proc sql;
create table work.c_match as
select a.*, b.*
from a_comp a, b_edgar b
where a.cik = b.cik
and b.date <= a.datadate <= b.date + 120;
quit;

So, matching on firm identifier is done through cik, and matching on date is done by requiring the filingdate to be within 120 days after end of year (the 120 days is arbitrary of course).

On the firms that change CIK. Let’s suppose firm a has a CIK of 1000 in 2009, 2010 and changed to 1020 in 2011 and 2012. Funda will hold the current CIK (‘1020’) for all the available years. In other words, when CIK changes for a firm, all of the records in Funda are updated and show the new CIK. The old CIK will no longer be in Funda. So, matching on the new CIK ‘1020’ will return only filings for the years 2011 and 2012, and not the earlier years.

Hope this helps,

Joost
 

Profile
 
 
Posted: 13 July 2015 07:17 AM   [ Ignore ]   [ # 11 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

WRDS is selling ‘WRDS SEC Analytics Suite’, which includes linktables between gvkey and CIK (also cusip-CIK, and another identifier, which I forgot). This matching table is not complete and will help identify some old CIKs.

Other than that, I don’t know.

Best,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 29 September 2015 10:35 PM   [ Ignore ]   [ # 12 ]
Newbie
Rank
Total Posts:  14
Joined  2014-11-08
Joost Impink - 13 July 2015 07:17 AM

WRDS is selling ‘WRDS SEC Analytics Suite’, which includes linktables between gvkey and CIK (also cusip-CIK, and another identifier, which I forgot). This matching table is not complete and will help identify some old CIKs.

Other than that, I don’t know.

Best,

Joost


Dear Joost,

Do you know why did the company change their CIK number? I knew one reason is that the company had a M&A and then the company need to change the CIK. Thank you.

Regards,
Emily

Profile
 
 
Posted: 29 September 2015 10:42 PM   [ Ignore ]   [ # 13 ]
Newbie
Rank
Total Posts:  14
Joined  2014-11-08
fcong - 12 July 2015 09:40 PM

hi Joost, regarding what you said “matching on the new CIK ‘1020’ will return only filings for the years 2011 and 2012, and not the earlier years.”, is there any sources that I can find out all the old CIK’s?

Joost Impink - 02 November 2012 02:09 PM

hi pwyw000,

Assuming you have local access to Compustat Funda (for example, you downloaded a sample set of firms). Assuming you have a dataset with: gvkey, fyear, datadate, cik called work.a_comp.
Assuming you have unzipped the dataset with Filing details into work.b_edgar, then matching would go like this.

proc sql;
create table work.c_match as
select a.*, b.*
from a_comp a, b_edgar b
where a.cik = b.cik
and b.date <= a.datadate <= b.date + 120;
quit;

So, matching on firm identifier is done through cik, and matching on date is done by requiring the filingdate to be within 120 days after end of year (the 120 days is arbitrary of course).

On the firms that change CIK. Let’s suppose firm a has a CIK of 1000 in 2009, 2010 and changed to 1020 in 2011 and 2012. Funda will hold the current CIK (‘1020’) for all the available years. In other words, when CIK changes for a firm, all of the records in Funda are updated and show the new CIK. The old CIK will no longer be in Funda. So, matching on the new CIK ‘1020’ will return only filings for the years 2011 and 2012, and not the earlier years.

Hope this helps,

Joost
 


Hey fcong,

Did you find the sources that can help to find out all the old CIK’s? Thank you.

Regards,
Emily

Profile
 
 
Posted: 30 September 2015 07:35 AM   [ Ignore ]   [ # 14 ]
Administrator
Avatar
RankRankRankRank
Total Posts:  901
Joined  2011-09-19

hi Emily,

I don’t know the reasons why firms change their CIK, but the match from Funda is not very good. If I take all Compustat firmyears after 1996 (drop the ones with missing assets), and match them using CIK in Funda on the SEC CIK for 10-Ks that are filed (including “10-K405”, “10-KSB”, “10-KT”, “10KSB”,  “10KSB40”, “10KT405”), I have a match of about 70% firmyears, and is pretty much stable over time.
I know that some of the ‘missing’ matches are due to the fact that Funda uses the IPO prospectus to backfill prior to the IPO but that should explain only a portion. Many firms simply have a missing CIK in Funda.

I would love to know more about what is going on (and how to improve the match).

Best Regards,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Profile
 
 
Posted: 15 October 2015 09:38 PM   [ Ignore ]   [ # 15 ]
Newbie
Rank
Total Posts:  14
Joined  2014-11-08
Joost Impink - 30 September 2015 07:35 AM

hi Emily,

I don’t know the reasons why firms change their CIK, but the match from Funda is not very good. If I take all Compustat firmyears after 1996 (drop the ones with missing assets), and match them using CIK in Funda on the SEC CIK for 10-Ks that are filed (including “10-K405”, “10-KSB”, “10-KT”, “10KSB”,  “10KSB40”, “10KT405”), I have a match of about 70% firmyears, and is pretty much stable over time.
I know that some of the ‘missing’ matches are due to the fact that Funda uses the IPO prospectus to backfill prior to the IPO but that should explain only a portion. Many firms simply have a missing CIK in Funda.

I would love to know more about what is going on (and how to improve the match).

Best Regards,

Joost

 

Dear Joost,

I only know that there are some changes during mergers and acquisitions.Thanks for your help.

Regards,
Emily

Profile