This example shows how SAS and LaTeX can be used to create articles in PDF format without going through Excel and Word. The text itself (introduction, literature review, etc.) obviously still has to be typed, whether in Word or in any other text processor. Nevertheless, efficiency can be gained in marking up the text and in creating and 'designing' the tables. This example focuses on tables by extending the Measuring ERC (event study) example. It creates a table with the regression output for 10 deciles, including the coefficients and t-values.

The macro that outputs the LaTeX writes to the 'log'. This works without problems in Base SAS, but in SAS Enterprise Guide (SAS EG) I could not suppress the logging of page numbers etc. Additionally, I have not yet gotten SAS EG to let me run 'SYSTASK COMMAND', which this example uses. In other words, this example works best (or only) with Base SAS.

When working on a paper, one can export a dataset with regression results from SAS to Excel and turn it into a nice table, to be included in Word. The main drawback of this approach is that it is manual. Time is spent on making the tables 'look nice', and with every update of the paper (changes in sample selection, a different regression, etc.), this process is repeated. Hence, a considerable amount of time is spent on text processing. There is an alternative, however, where the text processing is automated using LaTeX.

LaTeX is a text processing language where the markup is combined with the text itself, much like HTML (LaTeX predates HTML by several years). Text and markup are mixed and edited together, and the editor compiles the document into pdf format whenever requested. Obviously, typing regression results into the various cells of a table would be time-consuming. However, SAS can be used to *generate* the tables, including the markup. The gain is basically taking out 'the middlemen' (Word and Excel) and generating the final paper in pdf format directly from SAS. This requires the full article (not just the tables) to be written in LaTeX. Hence, the 'cost' or drawback of using LaTeX is getting to know LaTeX markup. This example hopefully helps in weighing the costs and benefits of using LaTeX.
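To give an impression of what 'text mixed with markup' looks like, here is a minimal sketch of a LaTeX document with a small table. The section title, column names and the placeholders $b$ and $t$ are made up for illustration and are not taken from the example files.

```latex
\documentclass{article}
\begin{document}

\section{Results}
Markup such as \emph{emphasis} or \textbf{bold} is typed inline with the text.

% A small table: columns are separated by & and each row ends with \\
\begin{tabular}{lcc}
  variable & coefficient & t-value \\
  \hline
  unex     & $b$         & ($t$)   \\
\end{tabular}

\end{document}
```

Every table cell is plain text between `&` separators, which is what makes it feasible to let SAS write the rows.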

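For this example to work, you will need a LaTeX text editor. For Windows, TeXworks is freely available. TeXworks supports BibTeX, an alternative to EndNote for managing references. The SAS code also calls the latex, dvips and ps2pdf14 commands, so these need to be installed and available on the system path.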

In this particular example, multiple LaTeX files are involved. The 'main' LaTeX file (LaTeX Example.tex) contains an \input statement (\input{table_reg_output.tex}). The file it pulls in holds the table and is generated by SAS. After creating this file, SAS runs LaTeX to compile the pdf.
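As a rough sketch, a main file that pulls in the generated table could look like the following. The document class, section title and sentence are assumptions made for illustration; the actual LaTeX Example.tex is in the zip-file. Only the \input line and the label ercByDecile (which the SAS code writes into the table file) come from this example.

```latex
% Hypothetical contents of the main file; the real file is in the zip-file
\documentclass{article}
\begin{document}

\section{Results}
Table~\ref{ercByDecile} reports the regression results by size decile.

% Pull in the table that SAS writes to table_reg_output.tex
\input{table_reg_output.tex}

\end{document}
```

A cross-reference such as \ref needs two LaTeX passes to resolve, which is presumably why the SAS code calls latex twice.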

Download the zip-file (right-click and select "save target as") containing the SAS code, the LaTeX code of the main document, the generated LaTeX code for the table, as well as the pdf.

A screenshot of the table in the pdf is included below.

The SAS code illustrates the following:

- use of macro variables (%LET varname = value)

- generating regression output including t-values, p-values, R squared

- converting a 'long' dataset into a 'wide' dataset (information in three rows is collected into a single row)

- generating LaTeX markup

- launching 'system commands' such as deleting a file

- running the LaTeX compiler from SAS

- launching Acrobat from SAS (to view the newly created pdf)

This example extends Learning by example 3: Measuring the earnings response coefficient (ERC). The same regression is run, but more of its output is saved. In example 3, only the coefficients were saved; this example also uses the t-statistics, p-values and R squared. Because the code reads the dataset created in example 3, example 3 must be run before the SAS code in this example will work.

To prevent the code for the table from being included multiple times (when re-running the script), an instruction is included that deletes the file. The following statement deletes the file:

&SC "del ""&outpath\&texfile..tex""" SHELL WAIT;

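In the SAS code, there are instances where three quotes are used ("""). This is not a mistake. Inside a double-quoted SAS string, two consecutive double quotes stand for one literal double quote, so the statement above passes the command del "...\table_reg_output.tex" to the shell, with the full path filled in. The inner quotes are needed because the path contains spaces. The double period in &texfile..tex is intentional as well: the first period ends the macro variable reference, and the second is the literal dot before the extension.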

/* This example uses the dataset u_finalWinsorized
   from the ERC event study example; */
libname myLib3 "G:\research\sas_projects\learn_accounting_wrds\example 3 ERC\sasdata";

/* Directory with this sas code */
%LET basepath=G:\research\sas_projects\learn_accounting_wrds\example 4 latex;

/* Output will appear in a subdirectory 'Latex', make sure it exists */
%LET outpath=&basepath\Latex;

/* LaTeX file (to be created by this script) containing table only */
%LET texfile=table_reg_output;

/* 'Master' LaTeX file, which is already in /latex folder */
%LET mainTexFile=LaTeX Example;

/* The names of the variables used in the regression
   the variables VAR1, VAR2, etc need to be named in the same order
   in which they appear in the regression output table */
%LET VAR1=Intercept;
%LET VAR2=preAnnRet_ln;
%LET VAR3=unex;
%LET VAR4=loss;
%LET VAR5=loss_unex;

/* Number of independent variables (including intercept) */
%LET DIM_VAR = 5;

/* These variables will be pushed to the latex documents;
   latex does not want these to be quoted, hence, no quotes around the text;*/
%LET latexHeader =decile & constant & preAnnRet & unex & loss & loss x unex & n & R$^{\textrm{2}}$;

/* Alignment of the columns: c is center, l is left align, r is right align; */
%LET latexColsAlign = cccccccc;

/* Regression by size decile
   notice the additional keywords 'RSQUARE' and 'TABLEOUT'
   'RSQUARE' will add the R-square as an additional row (and some other variables)
   'TABLEOUT' will generate several rows containing t-values, p-values,
   confidence intervals, etc */
PROC REG OUTEST = myLib3.v_regOutput2 data=myLib3.u_finalWinsorized RSQUARE TABLEOUT;
  ID capn;
  MODEL car_ln = preAnnRet_ln unex loss loss_unex/ NOPRINT;
  by capn;
RUN ;

/* Create wide dataset with all relevant info for each regression
   in a single row (as opposed to 3 rows);*/
%MACRO makeWide;
data myLib3.w_coeffs (keep = capn numObs RSQUARED COEF1-COEF20 TVALUE1-TVALUE20 PVALUE1-PVALUE20);
  set myLib3.v_regOutput2;
  by capn;

  /* The variables listed after 'retain' are retained over the rows
     since the info needed is in 3 different rows, this feature is needed; */
  retain COEF1-COEF20 TVALUE1-TVALUE20 PVALUE1-PVALUE20 numObs RSQUARED;

  /* An array is used to use the various variables,
     for example aCOEF(3) refers to variable COEF3
     this is helpful in do-while loops, where a counter is used
     (in this case I is used, which counts from 1 till 5)*/
  array aCOEF(1:20) COEF1-COEF20;
  array aTVALUE(1:20) TVALUE1-TVALUE20;
  array aPVALUE(1:20) PVALUE1-PVALUE20;

  %DO I = 1 %TO &DIM_VAR;
    /* row with _TYPE_ "PARMS" holds the coefficients, R-squared and number of observations
     * number of obs is _EDF_ (Error degrees of freedom) + _P_ (Number of parameters in model); */
    if _TYPE_ eq "PARMS" then do;
      RSQUARED = _RSQ_;
      numObs = _EDF_ + _P_;
      aCOEF(&I) = &&VAR&I;
    end;
    /* row with _TYPE_ "T" holds the t-values */
    if _TYPE_ eq "T" then do;
      aTVALUE(&I)= &&VAR&I;
    end;
    /* row with _TYPE_ "PVALUE" holds the p-values */
    if _TYPE_ eq "PVALUE" then do;
      aPVALUE(&I)= &&VAR&I;
    end;
  %END;

  if last.capn then output;
run;
%MEND makeWide;

/* Invoke the macro */
%makeWide;

/* Create variables latexCoef and latexSign holding the latex markup;*/
data myLib3.x_latexMarkup (keep = latexCoef latexSign);
  set myLib3.w_coeffs;
  array aCOEF(1:20) COEF1-COEF20;
  array aTVALUE(1:20) TVALUE1-TVALUE20;
  array aPVALUE(1:20) PVALUE1-PVALUE20;
  LENGTH latexCoef latexSign strTVal2 $5000.;

  latexCoef = put (capn, 2.0); * starts with decile number;
  latexSign = " "; * starts with empty column;

  DO i = 1 to 20 ;
    if aCOEF(i) ne . then do;
      latexCoef = strip(latexCoef) || " & " || put( Round(aCOEF(i), 0.001), 6.3) ;
      strTVal = put(abs(Round(aTVALUE(i), 0.001)), 6.2);
      strTVal2 = " &(" || strip(strTVal) || ")";
      /* this would be the place to also add a symbol for significance at 10% */
      if (aPVALUE(i) < 0.05) then strTVal2 = " &(" || strip(strTVal) || ")*" ;
      if (aPVALUE(i) < 0.01) then strTVal2 = " &(" || strip(strTVal) || ")**" ;
      latexSign = strip(latexSign) || strip (strTVal2);
    end;
  end;

  /* Add number of observations and Rsquared as separate columns; */
  latexCoef = strip(latexCoef) || " & " || put( Round(numObs, 0.001), COMMA6.0) ;
  latexCoef = strip(latexCoef) || " & " || put( Round(RSQUARED, 0.001), 6.3) || "\\";
  latexSign = strip(latexSign) || " & & \\";
run;

/* Output LaTeX code***********************************************************;*/

/* Some variables that need to be set
   Verify that the location/filename of Acrobat Reader is correct!! */
*%LET acrorpath=C:\Program Files\Adobe\Reader 8.0\Reader;
*%LET acrorpath=C:\Program Files\Adobe\Reader 9.0\Reader;
%LET acrorpath=C:\Program Files (x86)\Adobe\Reader 10.0\Reader;

X CD &outpath;
%LET qrep='?';
%LET SC=SYSTASK COMMAND;

/* Macro to write latex text by Juha-Pekka Perttola,
   see http://www.lexjansen.com/phuse/2008/ts/ts06.pdf */
%MACRO t(_text,_opt);
  OPTIONS NOSOURCE NONOTES;
  %IF &_opt = n %THEN %LET _opt = new;
  DATA _NULL_;
    LENGTH _text $5000.;
    _text = SYMGET('_text') ;
    _text = COMPRESS(_text,"'");
    _text = TRANSLATE(_text,"'",&qrep);
    CALL SYMPUT('_text',_text);
  RUN;
  PROC PRINTTO LOG = """&outpath\&texfile..tex""" &_opt;
  RUN;
  %PUT &_text;
  PROC PRINTTO;RUN;
  OPTIONS SOURCE NOTES;
%MEND;

/* Delete the output file in case it exists */
&SC "del ""&outpath\&texfile..tex""" SHELL WAIT;

/* Table formatting, excluding the regression output; */
%t('\begin{table}');
%t('\caption{ERC by capitalization decile}');
%t('\label{ercByDecile}');
%t('\begin{center}');

/* Variables with latex markup are written here;*/
%t('\begin{tabular}{&latexColsAlign}');
%t(&latexHeader\\);
%t('\hline');

/* Output generated latex markup
   data _null_ means that no SAS data set is created
   instead of using the %t-macro, it is 'PUT' manually
   (the t-macro does not work within a data step) */
data _null_ ;
  set myLib3.x_latexMarkup ;
  FILE """&outpath\&texfile..tex""" MOD; /* MOD: modify, appending to existing file;*/
  PUT latexCoef ;
  PUT latexSign ;
run ;

/* Continue with latex, end of table;*/
%t('\hline');
%t('\end{tabular}');
%t('\par\medskip\footnotesize');
%t('* significant at 5\%; ** significant at 1\%');
%t('\end{center}');
%t('\end{table}');

/* Tell LaTeX to (re)compile the main document */
&SC "latex ""&outpath\&mainTexFile..tex""" SHELL WAIT;
&SC "latex ""&outpath\&mainTexFile..tex""" SHELL WAIT;
&SC "dvips ""&outpath\&mainTexFile..dvi""" SHELL WAIT;
&SC "ps2pdf14 ""&outpath\&mainTexFile..ps"" ""&outpath\&mainTexFile..pdf""" SHELL WAIT;

/* Launch Acrobat with pdf
   This could be omitted (it merely opens Acrobat Reader with the file). */
&SC """&acrorpath\AcroRd32.exe"" ""&outpath\&mainTexFile..pdf""";
&SC """&acrorpath\AcroRd32.exe""";
run;
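Pieced together from the %t calls and the PUT statements above, the generated table file should look roughly like the sketch below. Only the layout is meant to be informative: the entries $b_0$, $t_0$, $n$ and so on are placeholders standing in for the numbers SAS writes, not actual regression results, and the position of the significance star is arbitrary.

```latex
% Approximate shape of table_reg_output.tex (placeholders, not actual results)
\begin{table}
\caption{ERC by capitalization decile}
\label{ercByDecile}
\begin{center}
\begin{tabular}{cccccccc}
decile & constant & preAnnRet & unex & loss & loss x unex & n & R$^{\textrm{2}}$\\
\hline
% First row per decile (latexCoef): decile, five coefficients, n, R squared
1 & $b_0$ & $b_1$ & $b_2$ & $b_3$ & $b_4$ & $n$ & $R^2$\\
% Second row per decile (latexSign): empty cell, five t-values, two empty cells
&($t_0$)&($t_1$)&($t_2$)&($t_3$)&($t_4$)* & & \\
% ... the same two rows repeat for each remaining decile ...
\hline
\end{tabular}
\par\medskip\footnotesize
* significant at 5\%; ** significant at 1\%
\end{center}
\end{table}
```

Each decile takes two table rows, which is why latexColsAlign has eight column specifiers while the t-value row starts with an empty cell and ends with two empty cells.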
