Lesson 8: Proc Print, Proc Sort and ODS
Getting More From Proc Print
Proc print displays data in the output window. The output can be saved, printed, or copied into a word processor or other program. To make the columns line up in a word processor, a monospace font (one in which all letters take up the same amount of space) must be used. A computer with SAS installed should have the "SAS Monospace" font available; otherwise, choose something like "Courier." In previous lessons we have made frequent use of proc print, but we have only introduced a few options, namely label and data=. In fact, the output of proc print is highly customizable. The full reference is in the SAS documentation under "Base SAS Procedures Guide/Procedures/The PRINT Procedure." We will explore some of the more common options and statements here.
Recall the example from a previous lesson, where we showed that data set options can be used in a proc print statement.
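As a reminder of what such a step looks like, here is a minimal sketch. It assumes the sashelp.class sample data set that ships with SAS (not the data from the earlier lesson) and uses a keep= data set option to limit the columns.

```sas
/* Sketch only: sashelp.class stands in for the lesson's data set */
proc print data=sashelp.class(keep=name age height);
run;
```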
As a demonstration of data set options, that was good, but it is not the way this is usually done. Proc print has a var statement which is used to list the variables that are to be printed. The proc step below will produce the exact same output. The var statement can also be used to change the order of the columns, as they will be printed in the same order they are listed in the var statement.
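A sketch of the var statement, again assuming the sashelp.class sample data set rather than the lesson's own data. The first step should print the same columns as a keep=name age height data set option would; the second shows that the columns follow the order given in the var statement.

```sas
/* Print three variables, in data set order */
proc print data=sashelp.class;
   var name age height;
run;

/* Same variables, but height is now the first column */
proc print data=sashelp.class;
   var height name age;
run;
```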
The noobs (no observation number) option can be added to the proc print statement to suppress printing the column of observation numbers.
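For instance, assuming the sashelp.class sample data set as a stand-in, the step might look like this:

```sas
/* noobs drops the Obs column from the report */
proc print data=sashelp.class noobs;
   var name age height;
run;
```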
Or, perhaps there is a variable in the data set that uniquely identifies each observation and is not the same as the observation number, such as a customer number, Social Security Number, or subject number in an experiment. In that case, add an id statement (that's pronounced eye-dee). The id statement causes the specified variable to be printed at the left instead of the observation number. Note that the id variable is not included in the var statement, if one is used (including it will cause it to be printed twice).
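A minimal sketch of an id statement follows. It assumes the sashelp.class sample data set and treats name as the identifying variable, which is only a convenient stand-in for something like a customer or subject number.

```sas
/* name is printed at the left in place of the observation number */
proc print data=sashelp.class;
   id name;
   var age height weight;   /* name is deliberately left out of var */
run;
```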
The label option was introduced in an earlier lesson. Consider the following program and output, with labels made up of several words. By default, SAS fits the labels on the column headings as best it can, using spaces to split the labels onto multiple lines. In this example it is not quite satisfactory. We notice that although "First Name" occupies two lines, "Last Name" does not. (data)
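A program of the kind discussed above might look like the following sketch. The survey data set, its variable names, and its values are invented for illustration and are not the data used in the lesson; how SAS wraps the headings will depend on your system settings.

```sas
/* Hypothetical survey data with multi-word labels */
data survey;
   input subj first $ last $ q1 q2;
   label first = 'First Name'
         last  = 'Last Name'
         q1    = 'Question Number One'
         q2    = 'Question Number Two';
   datalines;
101 Ann Lee 4 5
102 Bob Ray 3 2
103 Cat Fox 5 4
;
run;

/* The label option makes proc print use labels as column headings */
proc print data=survey label;
   id subj;
run;
```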
The split= option can be used to control where the breaks in the labels occur. When using the split option, the label option does not have to be specified because it is implied. Any character can be used to control the split. Often a space will do.
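Here is a sketch of split= with a space as the split character. It assumes the sashelp.class sample data set, and the labels are made up for the example; with this setting every space in a label starts a new heading line.

```sas
/* split=' ' implies the label option and breaks headings at each space */
proc print data=sashelp.class split=' ';
   var name height weight;
   label name   = 'First Name'
         height = 'Height in Inches'
         weight = 'Weight in Pounds';
run;
```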
That fixes the "Last Name" problem, but now the "Question" headings take up four lines. Maybe that's not what we want. By specifying a different split character we can control where the splits occur.
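A sketch of a custom split character, assuming the sashelp.class sample data set and invented labels. The asterisk is an arbitrary choice; it marks the only places where a heading may break and is not printed itself.

```sas
/* Headings break only where an asterisk appears in the label */
proc print data=sashelp.class split='*';
   var name height weight;
   label name   = 'First*Name'
         height = 'Height*in Inches'
         weight = 'Weight*in Pounds';
run;
```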
Due to the wider headings, the variables are now printed in two sections. Note that the id variable is repeated in the second section (obs will do the same if used). Even if there are many observations, SAS will, by default, break each page into sections like this, if all the columns don't fit on one line. You can use a rows=page option to force each page to be all one section.
SAS has many rules for trying to decide what the best way to fit things on a page will be. Sometimes it prints the headings vertically. The direction can be forced using a heading= option. The arguments are v (or vertical) and h (or horizontal).
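For example, assuming the sashelp.class sample data set, the following sketch asks for vertical headings; replacing v with h asks for horizontal ones.

```sas
/* Force column headings to print vertically */
proc print data=sashelp.class heading=v;
run;
```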
The sum statement causes SAS to calculate the sum of one or more variables. (data)
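A minimal sketch of the sum statement, assuming the sashelp.class sample data set in place of the lesson's data. The totals appear at the bottom of the height and weight columns.

```sas
/* Print a total line for the variables named in sum */
proc print data=sashelp.class;
   var name height weight;
   sum height weight;
run;
```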
The by statement causes proc print to generate separate reports for each value of the by variable. (Data must be sorted by the by variables.)
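A sketch of by-group printing, assuming the sashelp.class sample data set. Because the data must be sorted first, the example sorts a copy into a work data set (class_sorted is a name chosen for this example).

```sas
/* Sort a copy by the grouping variable first */
proc sort data=sashelp.class out=class_sorted;
   by sex;
run;

/* One report section per value of sex */
proc print data=class_sorted;
   by sex;
   var name age height;
run;
```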
And you can print subtotals for the by groups like this:
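One way this can look, assuming the sashelp.class sample data set and a work data set named class_sorted (both stand-ins for the lesson's data): combining a by statement with a sum statement gives a subtotal for each by group and a grand total at the end.

```sas
/* Data must be sorted by the by variable */
proc sort data=sashelp.class out=class_sorted;
   by sex;
run;

/* Subtotals of weight for each value of sex, plus a grand total */
proc print data=class_sorted;
   by sex;
   var name weight;
   sum weight;
run;
```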
Getting More From Proc Sort
Data sets can be sorted using proc sort. A by statement lists one or more variables to use as sort-keys. The keyword descending can precede any variable for which the sort order is to be reversed. A data= option specifies the data set to sort; otherwise the most recently created data set is used. Here is a simple example that sorts by one column. The original data set is replaced by the sorted one.
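A sketch of such a sort follows. The data set one is created here as a copy of the sashelp.class sample data set, which is an assumption made so the example can run on its own.

```sas
/* Make a work copy so there is something to sort in place */
data one;
   set sashelp.class;
run;

/* No out= option, so data set one is replaced by its sorted version */
proc sort data=one;
   by age;
run;

proc print data=one;
run;
```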
In the example below, an option to create a new data set called two is included, along with a data set option specifying which variables to keep. This leaves the original data one as it is.
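A minimal sketch of the out= option with a keep= data set option. As before, data set one is assumed to be a copy of the sashelp.class sample data set, and the variables kept are chosen only for illustration.

```sas
data one;
   set sashelp.class;
run;

/* Sorted result goes to two; one is left unchanged */
proc sort data=one out=two(keep=name age height);
   by age;
run;

proc print data=two;
run;
```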
Any number of sort keys can be specified. The next example shows a sort by school, then by score in descending order within each school. Remember that descending applies only to the variable that immediately follows it.
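A sketch of a two-key sort. The scores data set and its values are invented for illustration; only the variable names school and score come from the description above.

```sas
/* Hypothetical data: school name and test score */
data scores;
   input school $ score;
   datalines;
North 82
South 91
North 95
South 77
North 68
;
run;

/* school ascending, score descending within each school */
proc sort data=scores;
   by school descending score;
run;

proc print data=scores;
run;
```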
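Proc sort can also eliminate duplicate observations while sorting. The noduprecs and nodupkey options can be added to the proc sort statement. Noduprecs eliminates observations that are exactly alike for all the variables, but it only compares each observation with the one before it in the sorted output, so identical observations that do not end up next to each other can both survive (sorting by all the variables avoids this). Nodupkey eliminates observations that have the same values in the sort key variables, even if there are differences in other variables.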
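The difference between the two options can be sketched with a tiny invented data set (visits, subj, and visit are hypothetical names). The out= option is used so the original data stay intact for the second sort.

```sas
/* Hypothetical data: the first two observations are exact duplicates */
data visits;
   input subj visit $;
   datalines;
1 a
1 a
1 b
2 a
;
run;

/* nodupkey: keeps only the first observation for each value of subj */
proc sort data=visits out=nokey nodupkey;
   by subj;
run;

/* noduprecs: drops an observation only if it matches the previous one
   on every variable, so "1 b" is kept along with one copy of "1 a" */
proc sort data=visits out=norecs noduprecs;
   by subj;
run;

proc print data=nokey;
run;

proc print data=norecs;
run;
```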
Introduction to ODS
ODS stands for "Output Delivery System." When a SAS procedure produces output, it is actually producing data, which is then passed to the Output Delivery System, which determines what should be done with the output data.
ODS has "destinations," which refer to the type of output to be produced. For example, the "listing" destination is the output window. In this lesson, we will demonstrate two other destinations, "html" and "pdf." Another important one is "rtf" which produces documents you can import into word processors. Destinations are opened and closed as shown below. The ods keyword, followed by a destination name, will open the destination. To close a destination, include the keyword "close." In the example, the listing destination is closed, meaning no output will go to the output window. Then the html destination is opened, meaning html will be created. After proc print does its job, the destinations are returned to their previous state (always a good idea). We see a nice html document in the results viewer.
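The sequence described above can be sketched as follows, assuming the sashelp.class sample data set as the data to print.

```sas
ods listing close;   /* stop sending output to the output window */
ods html;            /* open the html destination */

proc print data=sashelp.class;
run;

ods html close;      /* close html */
ods listing;         /* reopen the listing destination */
```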
ODS can send output to multiple destinations. (We didn't need to close the listing destination--it was just done to provide an example.) It can also create files, as shown below.
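A sketch of writing a file with the file= option. The file name class.pdf is hypothetical, and a name without a path is written to the current SAS working directory, so you may prefer to give a full path. Any destination that is already open (such as listing) also receives the output.

```sas
/* Open the pdf destination and name the file to create */
ods pdf file='class.pdf';

proc print data=sashelp.class;
run;

/* The file is complete only after the destination is closed */
ods pdf close;
```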
SAS usually displays the html or pdf document in the Results Viewer window, even if it is written to a file. Here is a pdf displayed in the Results Viewer.
If you're preparing documents for publication, presentation, or the web, you may not be satisfied with the way this output looks. ODS allows us to modify the way documents look by applying styles. For example, the journal style is designed to be compatible with the requirements of many scientific journals:
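A minimal sketch of applying a style, assuming the sashelp.class sample data set and a hypothetical output file name:

```sas
/* style= selects the style used for this destination */
ods pdf file='class_journal.pdf' style=journal;

proc print data=sashelp.class noobs;
run;

ods pdf close;
```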
A set of styles is installed with your SAS system. This little program will display the styles available on your system:
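One common way to do this is with proc template. The sketch below lists the style definitions it finds; the exact list depends on your SAS installation.

```sas
/* List the style templates available in this SAS session */
proc template;
   list styles;
run;
```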
Exercises
The file usedcars3.txt contains part of an inventory of used cars from a local car dealer. The variables are year, make, model, color, miles, inventory number, and price. Read this data into a SAS data set. Include descriptive labels for all the variables, at least some of which should be made up of several words (separated by spaces) for this exercise. Include formats for miles and price. Save this data set for the next lesson (think about what that means--hint: permanent). Don't forget to include titles that identify which question your output belongs to.
1) Print the data without labels and with the inventory number as an ID variable.
2) Print the data with labels, including inventory number as an ID variable. Use a statement (not data set option) so that only inventory number, make, model, year, and price, in that order, are printed.
3) Look at your headings in part 2. They will be printed either vertically or horizontally. Use an option to force them to print the other way.
4) Now take the label statement you put in the data set and insert it in proc print. Revise this label statement to include a split character and print all of the variables (without using an ID statement) with the split labels. Do not just replace spaces with a split character. Make sure to do something that will demonstrate the difference that using a split character makes. Continue using these labels for the rest of the parts below.
5) Sort the data in order of price from highest to lowest and print the result.
6) Print the data with columns for all the variables except color, but separate the observations into groups for each color, without any sums or subtotals.
7) Print the year, make, model, color, and price, and include a subtotal of the price for each make as well as a grand total.
8) Sort the data in alphabetical order of make, and within make, in order of miles from lowest to highest. Include a nice title, suppress the page number and date, and put the result in a new data set and print it using the ODS pdf destination with journal style. Save the pdf file and submit it separately.
Copyright reserved by Dr. Dwight Galster, 2006. Please request permission for reprints (other than for personal use) from firstname.lastname@example.org. "SAS" is a registered trade name of SAS Institute, Cary, North Carolina. All other trade names mentioned are the property of their respective owners. This document is a work in progress and comments are welcome. Please send an email if you find it useful or if your site links to it.