Perl program for SEC EDGAR FTP files returns "file not found" when the file exists - please help!
Posted: 04 September 2013 10:13 AM
Newbie
Total Posts:  10
Joined  2013-09-04

Hi,

It would be great if someone can help. I really am stuck.

I am downloading the master files from SEC edgar and I got the script from—

I get the error “404 master.gz not found”.
While debugging I had the script print the URL, and when I paste that same URL into a browser I can download the file. It is parsing the URL correctly up to QTR1, but after that it cannot find the file even though it exists. Please help. The code below gets the master files from the QTR folders. Pasting my code:

—————-

Posted: 04 September 2013 12:15 PM   [ # 1 ]
Administrator
Total Posts:  901
Joined  2011-09-19

hi Uma,

Which year-quarters get downloaded correctly, and which don’t?

It looks like you are only trying to download $year=1995, because $year < 1996 will only be true for 1995. What do you mean when you say it works until QTR1? Do you get 1995 Q4, Q3, Q2 as well as Q1? Or just Q4, Q3, and Q2?

best regards,

Joost

 Signature 

To reply/post new questions: Please use the group WRDS/SAS on Google Groups! http://groups.google.com/d/forum/wrdssas

Posted: 04 September 2013 12:34 PM   [ # 2 ]
Newbie

Thanks Joost for the reply. Sorry for not explaining properly.

1) Yes, for debugging I have limited the code to 1995 for now (I plan to extend it to the years 1995 to 2012 later).


I have wasted so much time on this seemingly simple thing, which should work but just does not. Could you copy and paste the code and run it? Does it work for you?

Thanks a lot for your help. I appreciate it.

Posted: 04 September 2013 12:53 PM   [ # 3 ]
Administrator

hi Uma,

The index downloads fine for me when I paste this URL into the browser window:
ftp://ftp.sec.gov/edgar/full-index/1995/QTR1/master.gz

I copied your code, which (for me) also works.

This could be caused by a firewall setting. It may be worth trying http:// instead of ftp://

For example:
http://www.sec.gov/Archives/edgar/full-index/1995/QTR1/master.gz
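In case it helps, here is a minimal sketch of how the master.gz URL could be built for every year and quarter, following the http pattern above. It is written in Python purely for illustration and is not your Perl script. The function name `master_index_urls` and the default range 1995 to 2012 (the years you mentioned) are my own assumptions. It only builds the URLs and does not download anything.

```python
# Base of the http pattern shown above (the part before the year).
BASE = "http://www.sec.gov/Archives/edgar/full-index"


def master_index_urls(first_year=1995, last_year=2012):
    """Return the master.gz URL for every quarter of every year in the range.

    Both bounds are inclusive; the function name and defaults are
    illustrative assumptions, not part of the original script.
    """
    urls = []
    for year in range(first_year, last_year + 1):  # +1 so last_year is included
        for quarter in range(1, 5):  # QTR1 .. QTR4
            urls.append(f"{BASE}/{year}/QTR{quarter}/master.gz")
    return urls


if __name__ == "__main__":
    # Print the four URLs for 1995 so they can be checked by hand in a browser.
    for url in master_index_urls(1995, 1995):
        print(url)
```

Printing the URLs first and checking one in a browser makes it easy to see whether the loop covers the years and quarters you expect before any download is attempted.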

Let me know how things go..

best regards,

Joost

 

Posted: 05 September 2013 02:03 AM   [ # 4 ]
Newbie

Thanks, Joost, for the reply and for pasting and running the code.

http didn’t work either.

The funny thing is that when I run wget ftp://anonymous:pw@ftp.sec.gov/edgar/full-index/1995/QTR1/master.gz
from the command line (with or without credentials), it works and the file is downloaded, but the Perl program just throws a 404! So I don’t know what is happening.

It is really frustrating: I have so much thesis work and am stuck at the very beginning, pulling the master files.

Posted: 05 September 2013 03:23 AM   [ # 5 ]
Newbie

Okay, here is the update.

I have a Windows machine, so I was doing my Perl/Unix work in an Ubuntu virtual machine on it. After your reply I had a feeling the VM was somehow causing the issue.

When I installed Perl for Windows and ran my program there, it worked just fine! Phew! At least it is working somewhere.

I still have to dig into what it is about the Ubuntu VM that causes the FTP issues.

Again, Joost, thanks a bunch for running the code for me and helping out!

Posted: 05 September 2013 06:51 AM   [ # 6 ]
Administrator

hi Uma,

Glad it worked out! By the way, the master files are here also: http://www.wrds.us/index.php/repository/view/25

best regards,

Joost

Posted: 05 September 2013 06:56 AM   [ # 7 ]
Newbie

Thanks a lot. Since these were in SAS I hesitated, but you never know, I might look them up sometime. Thanks a bunch!

Posted: 05 September 2013 08:14 AM   [ # 8 ]
Administrator

You’re welcome; I believe Stata is able to read SAS files though.

best regards,

Joost

Posted: 05 September 2013 11:08 AM   [ # 9 ]
Newbie

thanks for the info!

Posted: 11 September 2013 09:50 AM   [ # 10 ]
Newbie

Joost, I hope you don’t mind, but I have a question. I looked around and even emailed SEC EDGAR but got no response.

I know that we can find different form types such as 8-K, 10-K and 10-Q and extract them from the SEC’s server. But do you know where I can find the letters to shareholders?

For example: http://www.sec.gov/Archives/edgar/data/1551152/000104746912006434/a2209760zex-99_1.htm

Sorry, this might be trivial, but I am checking whether somebody already knows. Thanks a bunch!

Posted: 11 September 2013 11:39 AM   [ # 11 ]
Administrator

hi Uma,

I believe the following is the case: if you use the SEC’s EDGAR website and select a 10-K (for example), it usually shows several documents.

See for example this 10-K (CIK 1800, fiscal year 2012):
http://www.sec.gov/Archives/edgar/data/1800/000104746911001056/0001047469-11-001056-index.htm

It has the 10-K, and several exhibits. (By the way, I tried to find the 10-K for Abbott (your example), but couldn’t immediately find it.)

On the page (link), it shows ‘Complete submission text file’; this includes the 10-K and all exhibits.

In the SEC edgar master filings dataset, the url for the 10-K of CIK 1800 is:  edgar/data/1800/0001047469-11-001056.txt
This matches the url on the website.
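To make the match concrete, here is a small sketch (in Python, only as an illustration) that turns the path from the master file into the index page url shown above. The function name `index_page_url` is hypothetical, and the folder convention (the accession number with the hyphens removed) is inferred from this single example, so please treat it as an assumption and check it against a few other filings.

```python
import posixpath

# Prefix taken from the example index page url above.
ARCHIVES = "http://www.sec.gov/Archives"


def index_page_url(master_path):
    """Build the filing index page url from a path listed in the master file.

    Assumption (inferred from one example only): the filing folder is the
    accession number with its hyphens removed, and the index page is named
    '<accession>-index.htm'.
    """
    directory, filename = posixpath.split(master_path)
    # Drop the '.txt' extension to get the accession number.
    accession = filename[:-len(".txt")] if filename.endswith(".txt") else filename
    folder = accession.replace("-", "")
    return f"{ARCHIVES}/{directory}/{folder}/{accession}-index.htm"


if __name__ == "__main__":
    print(index_page_url("edgar/data/1800/0001047469-11-001056.txt"))
```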

In other words, the 10-K filing (complete file) includes all exhibits.

If you look at the html source code (in browser, right-click ‘view source code’), the first 5 lines are:
<DOCUMENT>
<TYPE>EX-99.1
<SEQUENCE>2
<FILENAME>a2209760zex-99_1.htm
<DESCRIPTION>EX-99.1

In other words, you can scan the 10-K filing for exhibits (each is opened by ‘<DOCUMENT>’ and closed by ‘</DOCUMENT>’, with TYPE set to EX-##). The (big) problem in your case is that you will need to do fancy text scanning to figure out what the exhibits contain (they can contain anything). I don’t think there is a robust way of identifying a letter to shareholders.
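A rough sketch of that scanning step, in Python for illustration only: it splits a complete submission text file on the DOCUMENT tags and keeps the blocks whose TYPE starts with EX-. The function name `list_exhibits` and the returned dictionary keys are my own choices, and the sketch assumes every header tag sits on its own line as in the five lines above. It only lists the exhibits; deciding which one is a letter to shareholders is still the hard part.

```python
import re

# Each document in the submission file sits between <DOCUMENT> and </DOCUMENT>.
DOCUMENT_RE = re.compile(r"<DOCUMENT>(.*?)</DOCUMENT>", re.DOTALL | re.IGNORECASE)
# Header tags such as <TYPE> have no closing tag; the value runs to the end of the line.
TYPE_RE = re.compile(r"<TYPE>([^\r\n<]+)", re.IGNORECASE)
FILENAME_RE = re.compile(r"<FILENAME>([^\r\n<]+)", re.IGNORECASE)


def list_exhibits(submission_text):
    """Return one dict per exhibit (TYPE starting with 'EX-') in the submission."""
    exhibits = []
    for match in DOCUMENT_RE.finditer(submission_text):
        body = match.group(1)
        doc_type = TYPE_RE.search(body)
        if doc_type is None:
            continue  # no TYPE line, so we cannot classify this block
        type_value = doc_type.group(1).strip()
        if not type_value.upper().startswith("EX-"):
            continue  # the main form (e.g. 10-K) or another non-exhibit document
        filename = FILENAME_RE.search(body)
        exhibits.append({
            "type": type_value,
            "filename": filename.group(1).strip() if filename else None,
            "text": body,
        })
    return exhibits
```

Calling `list_exhibits` on the text of a downloaded submission file gives a list you can loop over, for example to search each exhibit’s text for phrases you would expect in a shareholder letter.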

I didn’t check, but I assume this would hold for 10-Q and 8-K filings as well.

Hope this helps,

Joost

 

 

Posted: 11 September 2013 12:44 PM   [ # 12 ]
Newbie

Thanks, Joost! You are a savior, and you have explained everything so clearly. How can I thank you?!

Thanks a lot for the prompt help!

You are right, there is no clear, consistent way to get this info. It would probably be better to see whether any databases store this for each firm. Thanks again :)

Posted: 11 September 2013 02:15 PM   [ # 13 ]
Administrator

You’re welcome :)

Joost
