Help with downloading 10-Ks from SEC EDGAR
Posted: 08 May 2012 07:24 PM
Newbie
Total Posts:  4
Joined  2012-05-08

Hello everyone,

I want to download the 10-Ks for all firms from 1998 to 2010 and save them in text format. Can anyone tell me where I can find the master file on the SEC website, and whether there is any SAS code to download all the information?

Thank you.

Posted: 08 May 2012 09:00 PM   [ # 1 ]
Administrator
Total Posts:  901
Joined  2011-09-19

hi,

The URLs for the SEC index files (which list the filings) look like this:
http://www.sec.gov/Archives/edgar/full-index/<YYYY>/QTR<Z>/company.idx

for example, quarter 4, 2008:
http://www.sec.gov/Archives/edgar/full-index/2008/QTR4/company.idx

I have downloaded these manually, but I suppose you can load them from SAS (Google “SAS file open url”).
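
To illustrate the URL pattern above, here is a minimal sketch in Python (not SAS, purely for illustration) that builds the index URL for every quarter from 1998 to 2010. The function name `index_urls` is made up for this example, and the year range is taken from the question. The code only builds the URLs; it does not download anything.

```python
# Base pattern copied from the post above; <YYYY> and <Z> are filled in below.
BASE = "http://www.sec.gov/Archives/edgar/full-index/{year}/QTR{quarter}/company.idx"


def index_urls(first_year, last_year):
    """Return one company.idx URL per quarter, in chronological order."""
    urls = []
    for year in range(first_year, last_year + 1):
        for quarter in range(1, 5):
            urls.append(BASE.format(year=year, quarter=quarter))
    return urls


if __name__ == "__main__":
    for url in index_urls(1998, 2010):
        print(url)
```

With 13 years and four quarters per year, the list holds 52 URLs, one per index file.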

best regards,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Posted: 08 May 2012 09:16 PM   [ # 2 ]
Newbie
Total Posts:  4
Joined  2012-05-08

Thanks for the reply. Do I have to download from a separate URL for each quarter, or is there one URL per year? Also, once I have downloaded the index, how do I get the text files? I need to create a folder of 10-Ks for all firms between 1998 and 2010. Do you know of any SAS code for the entire procedure? I am new to this and have no idea how to do it other than manually downloading each file.

Posted: 08 May 2012 09:23 PM   [ # 3 ]
Newbie
Total Posts:  4
Joined  2012-05-08

I found this SAS code online, but I am not sure how to use it after I download the files from all the URLs for 1998-2010.

/* select all 10-K filings from the filings dataset */
proc sql;
  create table a_10k as select
    distinct cik, coname as edgarConame, filename as url,
    date as filingdate10k, formtype
  from edgar.filings b
  where formtype in ("10-K");
quit;

/* number the filings; downloadID identifies each file to download */
data a_10k;
  set a_10k;
  downloadID = _N_;
  blank = 0;
run;

proc sql;
  create table b_downloadlist as
  select downloadID, url, blank from a_10k;
quit;

/* write the download list as a tab-delimited text file */
proc export data=b_downloadlist
  outfile="c:\temp\c_10K_list.txt"
  dbms=tab replace;
run;

Any guidance will be appreciated.

Posted: 09 May 2012 07:29 AM   [ # 4 ]
Administrator
Total Posts:  901
Joined  2011-09-19

ok,

Did you see this post: http://www.wrds.us/index.php/repository/view/25

It is a dataset with all SEC filings 1994-2010. Just select the ones where the form type is 10-K and you will have what you are looking for, right?

You would then run the code you mention above to create a text file with a download list. As also described here (http://www.wrds.us/index.php/tutorial/view/26), you could then use Perl to download the actual filings (text/html files) from EDGAR.
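
The selection and download-list steps described above might look like this in Python. This is only a sketch: the sample records are made up, and the field names (`cik`, `coname`, `formtype`, `date`, `filename`) are assumed to match the columns used in the SAS code earlier in the thread. The functions `build_download_list` and `write_download_list` are hypothetical names for this example.

```python
import csv
import io

# Hypothetical sample rows standing in for the filings dataset.
SAMPLE_FILINGS = [
    {"cik": "1001", "coname": "EXAMPLE CORP", "formtype": "10-K",
     "date": "2008-03-14", "filename": "edgar/data/1001/example-10k.txt"},
    {"cik": "1001", "coname": "EXAMPLE CORP", "formtype": "10-Q",
     "date": "2008-05-09", "filename": "edgar/data/1001/example-10q.txt"},
    {"cik": "1002", "coname": "SAMPLE INC", "formtype": "10-K",
     "date": "2008-03-28", "filename": "edgar/data/1002/sample-10k.txt"},
]


def build_download_list(filings, form_type="10-K"):
    """Keep filings of one form type and number them from 1, like downloadID = _N_."""
    selected = [f for f in filings if f["formtype"] == form_type]
    return [(i, f["filename"], 0) for i, f in enumerate(selected, start=1)]


def write_download_list(rows, handle):
    """Write downloadID, url, blank as tab-separated lines."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_download_list(build_download_list(SAMPLE_FILINGS), buffer)
    print(buffer.getvalue(), end="")
```

Each output line holds the download id, the file path, and the blank column, which is the same shape as the list the SAS export produces.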

best regards,

Joost

Posted: 17 March 2014 04:00 AM   [ # 5 ]
Newbie
Total Posts:  6
Joined  2013-10-18

Dear Joost,

Thank you for the helpful responses. I would like to know if this code would work with JMP. My other challenge is how to download data from Compustat for companies that have international operations. I would prefer to use a search query, if possible, rather than downloading one company at a time. My objective is to download financial statement variables for an accounting research model.

Thanks for your assistance.

Kipper.

Posted: 17 March 2014 01:08 PM   [ # 6 ]
Administrator
Total Posts:  901
Joined  2011-09-19

hi Kipper,

I am not sure what JMP is.

Do you know how to identify firms with international operations?
One dataset that may help is the following:  https://sites.google.com/site/scottdyreng/
It holds the unique countries of the firm’s subsidiaries (taken from Exhibit 21 in the annual report).

The WRDS web interface allows you to upload a list of the companies whose data you would like to download. A somewhat more advanced approach would be to download the datasets to your own machine and match them locally with other datasets.

Hope this helps,

Joost

Posted: 18 March 2014 12:19 AM   [ # 7 ]
Newbie
Total Posts:  6
Joined  2013-10-18

Hi Joost,

I did not notice that was an old thread.

Thanks for the response, and the link to Scott Dyreng’s data set.

I downloaded a list of companies with CIK identifiers. For each firm-year, how do I download the BS and IS variables I need for my analysis?

I have reviewed the code instructions on the forum, hence my question about whether the code can be run in JMP. I should be asking this of the SAS people, really.

As I understand it, the SAS code would fill out the table with the required data from WRDS. Is that correct?

By the way, JMP is a SAS package but I don’t think it can do as much as PC-SAS.

I hope these questions make sense and are not too much to ask.

Thanks a million.

Kipper

Posted: 18 March 2014 06:48 AM   [ # 8 ]
Administrator
Total Posts:  901
Joined  2011-09-19

hi Kipper,

There are several ways of getting to the Compustat data:
- Use the WRDS website: http://wrds-web.wharton.upenn.edu/wrds/ds/compm/funda/index.cfm?navGroupHeader=Compustat Monthly Updates&navGroup=North America
  Select CIK and upload a file with the CIK codes of the companies.
- Use SAS remote: connect to the WRDS server, upload a dataset with the CIK codes, match it with Compustat (to get the variables you need), and download the resulting dataset.
- Download the SAS datasets that you need from WRDS using SSH (so you have a local copy) and do the merge locally.
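
The last option (a local merge) could be sketched as follows, in Python rather than SAS. The rows and the variable names (`cik`, `fyear`, `at`, `ni`) are illustrative assumptions, not an actual Compustat extract, and `match_by_cik` is a made-up helper. The point is only to show how a list of CIK codes selects firm-years from a larger table.

```python
# Hypothetical CIK codes of the firms of interest.
WANTED_CIKS = ["0000001001", "0000001003"]

# Hypothetical firm-year rows standing in for a downloaded dataset.
FUNDA_ROWS = [
    {"cik": "0000001001", "fyear": 2009, "at": 150.0, "ni": 12.5},
    {"cik": "0000001001", "fyear": 2010, "at": 165.0, "ni": 14.0},
    {"cik": "0000001002", "fyear": 2010, "at": 80.0, "ni": -3.0},
    {"cik": "0000001003", "fyear": 2010, "at": 42.0, "ni": 1.5},
]


def match_by_cik(wanted_ciks, rows):
    """Return the rows whose CIK is in the list (an inner join on cik)."""
    wanted = set(wanted_ciks)  # set lookup keeps the match fast for long lists
    return [row for row in rows if row["cik"] in wanted]


if __name__ == "__main__":
    for row in match_by_cik(WANTED_CIKS, FUNDA_ROWS):
        print(row["cik"], row["fyear"], row["at"], row["ni"])
```

CIK codes are compared as strings here, so the leading zeros need to be consistent in both inputs.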

I am not familiar with JMP, so I wouldn’t know how that would work.

best regards,

Joost

Posted: 17 November 2014 08:31 PM   [ # 9 ]
Newbie
Total Posts:  17
Joined  2014-11-16

Hi Joost,

Your SEC filings data from 1993-2012 is sweet!

Two issues for me:

1) I’m interested in the 10-K and 10-Q filings in your dataset, plus additional 10-K and 10-Q filings through, say, 12/31/2013. The kicker is that my school has no license to the WRDS SEC filings data. I *think* I can get the 2013 data via the index files from FTP/EDGAR. One approach: use the form.idx data from the full-index folder (choosing the 10-K and 10-Q forms), choose the year 2013 folder, then for each quarter copy and paste the output displayed on the screen into a text file or an Excel CSV file, import that file into SAS, and append it to your dataset. I’m sure there is a more efficient solution, and I welcome any ideas.

2) Assuming that I somehow manage to pull off the feat described above (LOL), I’m looking to add a variable/column that contains the complete submission file size. I know Perl has an operator that computes the file size, something like: my $filesize = -s "filename.txt"; Any ideas for pulling off the file size idea? Not the sharpest tool in the shed, but willing to work at the issue to make up for the lack of intel.

Thanks for your attention and any ideas! 
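
For the first issue, one way to avoid copying and pasting is to save each quarter's form.idx as a text file and filter it with a short script. The Python sketch below assumes each data line starts with the form type and ends with CIK, filing date, and file name separated by whitespace. The sample lines are invented to show that assumed shape, and `parse_index` is a hypothetical name, so check the layout against a real index file before relying on this.

```python
# Invented lines in the assumed layout: form type, company name, CIK, date, file name.
SAMPLE_INDEX = """\
10-K        EXAMPLE CORP                 1001   2013-03-15  edgar/data/1001/example-10k.txt
10-Q        EXAMPLE CORP                 1001   2013-05-10  edgar/data/1001/example-10q.txt
8-K         SAMPLE HOLDINGS INC          1002   2013-02-01  edgar/data/1002/sample-8k.txt
10-K/A      SAMPLE HOLDINGS INC          1002   2013-04-02  edgar/data/1002/sample-10ka.txt
"""


def parse_index(text, wanted_forms=("10-K", "10-Q")):
    """Return one dict per line whose form type is exactly one of wanted_forms."""
    filings = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 5 or tokens[0] not in wanted_forms:
            continue  # skips headers, blank lines and other form types
        filings.append({
            "formtype": tokens[0],
            "coname": " ".join(tokens[1:-3]),  # the name may contain spaces
            "cik": tokens[-3],
            "date": tokens[-2],
            "filename": tokens[-1],
        })
    return filings


if __name__ == "__main__":
    for filing in parse_index(SAMPLE_INDEX):
        print(filing["formtype"], filing["cik"], filing["date"], filing["filename"])
```

The match on form type is exact, so amended filings such as 10-K/A are left out unless they are added to `wanted_forms`.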

Posted: 17 November 2014 09:04 PM   [ # 10 ]
Administrator
Total Posts:  901
Joined  2011-09-19

hi wrkrbeee,

How about this: use the current 1993-2012 data for now, and at the beginning of 2015 (when the master index files for 2014 are available) I will update the dataset to include 2013 and 2014?

About the file size; it probably makes sense to strip HTML markup before measuring the filing length. A package that can do that is HTML::StripScripts.

If that doesn’t sound appealing and you want to go with raw file size, you can use File::stat, see http://perldoc.perl.org/functions/stat.html

On that page, this piece of code looks useful.

use File::stat;

my $sb = stat($filename);

printf "File is %s, size is %s, perm %04o, mtime %s\n",
       $filename, $sb->size, $sb->mode & 07777,
       scalar localtime $sb->mtime;

You would use this ‘inside’ a loop over all files in the directory. Something like this:

#!/usr/bin/perl

use strict;
use warnings;
use File::stat;

my $dir = '/tmp';

opendir(DIR, $dir) or die $!;

while (my $file = readdir(DIR)) {

    # Use a regular expression to ignore files beginning with a period
    next if ($file =~ m/^\./);

    # stat needs the path including the directory, not just the file name
    my $sb = stat("$dir/$file") or next;

    # print file name and size in bytes, comma-separated
    print "$file,", $sb->size, "\n";
}

closedir(DIR);
exit 0;
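
For comparison, here is the same idea in Python, covering both suggestions: raw file size and length after stripping HTML markup. This is a sketch, not a drop-in replacement for the Perl above. The `TextOnly` class and the `stripped_length` and `filing_sizes` functions are names made up for this example, and the sketch assumes the filings are saved as `.txt` files in one directory.

```python
import os
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect the text between tags and drop the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def stripped_length(markup):
    """Number of characters left after removing HTML tags."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return len("".join(parser.chunks))


def filing_sizes(directory):
    """Return (file name, raw size in bytes, stripped length) per .txt file."""
    results = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.startswith(".") or not name.endswith(".txt") or not os.path.isfile(path):
            continue
        # latin-1 maps every byte to a character, so reading never fails on odd bytes
        with open(path, encoding="latin-1") as handle:
            text = handle.read()
        results.append((name, os.path.getsize(path), stripped_length(text)))
    return results


if __name__ == "__main__":
    for name, raw_size, text_length in filing_sizes("/tmp"):
        print(f"{name},{raw_size},{text_length}")
```

Keeping both numbers side by side makes it easy to decide later whether the raw size or the stripped length is the better measure.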

Let me know how it goes..

best regards,

Joost

 

Posted: 17 November 2014 10:06 PM   [ # 11 ]
Newbie
Total Posts:  17
Joined  2014-11-16

Hi Joost!

Dude, that’s freaking fantastic!

You update that dataset and I’m gonna start believing in Santa Claus again.  LOL

I would be forever grateful man.

My school has several WRDS licenses, just not wealthy enough to get stuff like the SEC Analytic Suite, EVENTUS, etc.

Knowing the updated dataset is covered, I can begin working to get the file size feature working using your existing dataset. 

Will take some time, but I can get that working.

Once the file size feature works, I will know if my research idea is gonna fly using your existing dataset.

Dude, let me tell ya, not many people with access to the big time data (WRDS) reach out to help out smaller players.

And I am impressed to say the least. 

You’re the best!

Many many thanks!

 

Posted: 18 November 2014 02:59 AM   [ # 12 ]
Newbie
Total Posts:  6
Joined  2013-10-18

Just thought I might chime in here to see if it would help. Loughran and McDonald have provided file size information for past periods. You may only need to update current information to get the job done. Try this link: http://www3.nd.edu/~mcdonald/Word_Lists.html

Good Luck.

Posted: 18 November 2014 08:27 AM   [ # 13 ]
Administrator
Total Posts:  901
Joined  2011-09-19

hi wrkrbeee,

Glad you like it :)

Thanks for the link, Kipper; looks useful.

best regards,

Joost

Posted: 18 November 2014 09:25 AM   [ # 14 ]
Newbie
Total Posts:  17
Joined  2014-11-16

Thanks a bunch Kipper!!

That link will be very helpful!

Really appreciate your effort!

Posted: 20 November 2014 08:13 PM   [ # 15 ]
Newbie
Total Posts:  17
Joined  2014-11-16

Hi Joost!

Thanks again for the file size loop.

Have it working, no errors or warnings.

Used two different operators to compute size.

Just lacking formatted output.

Attached the code, and two small data items.

Have any thoughts?

Thanks for your insight!

File Attachments
test.pl  (File Size: 1KB - Downloads: 0)
0001144204-09-017307.txt  (File Size: 8KB - Downloads: 189)
0001327459-09-000004.txt  (File Size: 10KB - Downloads: 243)