Sas Basicsppt

download Sas Basicsppt

of 60

Transcript of Sas Basicsppt

  • 8/6/2019 Sas Basicsppt

    1/60

  • 8/6/2019 Sas Basicsppt

    2/60

    2

  • 8/6/2019 Sas Basicsppt

    3/60

    SAS Overview

    3

  • 8/6/2019 Sas Basicsppt

    4/60

    4

    ` Data Portion

    ` Rules for SAS name (Varibale/Data Set)

  • 8/6/2019 Sas Basicsppt

    5/60

    5

  • 8/6/2019 Sas Basicsppt

    6/60

    6

    ` Observation flow via Data step

  • 8/6/2019 Sas Basicsppt

    7/60

    ` Program data vector (PDV) is a logical area in memory where SAS builds a data set, oneobservation at a time. When a program executes, SAS reads data values from the input buffer (alogical area in memory into which SAS reads each record of raw data when SAS executes anINPUT statement) or creates them by executing SAS language statements. The data values areassigned to the appropriate variables in the program data vector. From here, SAS writes the valuesto a SAS data set as a single observation. When the DATA step reads a SAS data set, SAS readsthe data directly into the program data vector.

    1. The SAS automatic variables ( _N_ and _ERROR_ )

    2. Temp variable (First., Last. , IN , END etc)

    While reading a Raw data file:-

    ` Compilation Phase - It checks for syntax errors and conversion of data step into the machine codealong with creation of two things viz. INPUT BUFFER and PDV.

    `

    Execution Phase - Before beginning of the this phase all the variables will be initialized tomissing(if character) and periods(if numeric)[use "put _all_" statement to check PDV status in thelog]. Then when the input statement is encountered for the first time the first record from raw datafile is moved from it to INPUT BUFFER after that one observation is then again moved to PDV.When run statement is encountered, then the implicit OUTPUT statement forces the read data rowto the output data set.

    7

  • 8/6/2019 Sas Basicsppt

    8/60

    ` Sample programe

    Data Total_points (drop=TeamName); 1

    input TeamName $ ParticipantName $ Event1 Event2 Event3; 2

    TeamTotal + (Event1 + Event2 + Event3); 3

    datalines;

    Knights Sue 6 8 8

    Kings Jane 9 7 8Knights John 7 7 7

    ;

    Run;

    ` Values from the First Record are Read (Input buffer, PDV, Output)

    8

    Computed Value of the Sum Statement

    http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000961108.htm

  • 8/6/2019 Sas Basicsppt

    9/60

    ` Parts of a Program

    Introduction comments

    Data import

    Data manipulation/cleaning

    Data analysis

    Data export

    9

  • 8/6/2019 Sas Basicsppt

    10/60

    ` Semi-colon at the end of every statement

    ` Spacing for aesthetics, not syntax

    ` Comments

    /* Comment */

    * Comment ;

    */; - completes open comments from prior runs

    ` Run; at the end to execute the code

    10

  • 8/6/2019 Sas Basicsppt

    11/60

    ` OPTIONS statement

    Specifies SAS system options

    Often used at the beginning of a program to specify

    page formatting, limit the number of observationsprocessed, etc.

    Unless re-specified, options stay the samethroughout the entire SAS session for all DATA stepsand PROCs that execute (even if the currentprogram does not specify them).

    OPTIONNOPRINT MLOGIC SYMBOLGENpagesize=60 ;

    DM LOG 'CLEAR' LOG;

    11

  • 8/6/2019 Sas Basicsppt

    12/60

    ` A Libname is a reference to a directory.

    libname test c:\temp\;

    Must be used for a permanent SAS data set

    ` A Fileref is a reference to a specific file. filename flatfile c:\temp\test.txt;

    Provides an easy way to change the file used whentesting

    12

  • 8/6/2019 Sas Basicsppt

    13/60

    ` % Let allows you to create substitution for namesthroughout the data set

    Use % Let when high chance that it will change

    Allows user to make changes in one place, rather

    than throughout the program

    Often use % Let for path names and file names

    13

  • 8/6/2019 Sas Basicsppt

    14/60

    ` Example of a %Let statement

    %LET OUTPUTPATH=C:\Documents and Settings\MyDocuments\SAS99 v3\Testing v3\Output;

    Filename PROGFILE "&OUTPUTPATH.\PROGFILE.sas";Filename EXCLUDE "&OUTPUTPATH.\EXCLUDE.sas";

    Filename OUT_ENT "&OUTPUTPATH.\Entry_nonzero.xls";

    Filename OUT_DQ "&OUTPUTPATH.\Mat_Exceptions.xls";

    14

  • 8/6/2019 Sas Basicsppt

    15/60

    Importing Data into SAS

    15

  • 8/6/2019 Sas Basicsppt

    16/60

    ` SAS can import the following types of files:

    Text (fixed-length, variable-length and delimited)

    Excel workbooks

    Access databases

    DBF files

    EBCIDIC files (IBM Mainframe)

    16

    SAS File Transfer.sas

  • 8/6/2019 Sas Basicsppt

    17/60

    ` There are two types of data sets:

    Working

    x Temporary data set stored in the Work directory

    x Once SAS is closed, the data sets are deleted

    Permanent

    x Data sets stored in folder on computer

    x Use libname reference as a pipeline to point to

    permanent SAS datasets for programx If SAS is closed, data sets are saved on the

    computer

    17

  • 8/6/2019 Sas Basicsppt

    18/60

    ` SAS views dates as the number of days sinceJanuary 1, 1960.

    ` Since SAS considers dates to be numeric,

    calculations are easily performed.

    ` SAS reads and prints a wide variety of datesincluding mm/dd/yy, yyyymmdd, Julian, etc..

    ` YEARCUTOFF= system optionhttp://ftp.sas.com/service/techsup/tsnews-l/0040.html

    18

  • 8/6/2019 Sas Basicsppt

    19/60

    ` When raw data contains invalid dates (e.g. 000000),dates can be read in as character and then convertedto a date format to eliminate errors in the log. Forexample,

    IF DATEVAR1 NE '000000' THEN DATEVAR =INPUT(DATEVAR1,YYMMDD6.);

    ` Comparisons to SAS date constants

    IF date > 12312000; WRONG!

    IF date > 31dec2000d; RIGHT!

    ` SAS Date Functions

    19

  • 8/6/2019 Sas Basicsppt

    20/60

    ` Informat Tells SAS how to

    read data valuesinto a variable

    Used in inputstatements

    ` Format

    Tells SAS how towrite data valueseither to a file, areport, or a screen

    Used in outputstatements

    20

    DATA TEMP;INFILE "G:\Mydocument\GLDET.TXT" DELIMITER = '~'

    MISSOVER DSD LRECL=32767 FIRSTOBS=2 ;

    INFORMAT GL_ACCOUNT__ $33. ;

    INFORMAT GL_ACCOUNT_DESCRIPTION $128. ;

    INFORMAT ACCOUNT_TYPE_INDICATOR $1. ;

    INFORMAT FDC_CODE $10. ;

    FORMAT GL_ACCOUNT__ $33. ;FORMAT GL_ACCOUNT_DESCRIPTION $128. ;

    FORMAT ACCOUNT_TYPE_INDICATOR $1. ;

    FORMAT FDC_CODE $10. ;

    INPUT

    GL_ACCOUNT__ $

    GL_ACCOUNT_DESCRIPTION $

    ACCOUNT_TYPE_INDICATOR $

    FDC_CODE $

    ;

    RUN;

  • 8/6/2019 Sas Basicsppt

    21/60

    Data Analysis

    and Manipulation

    21

  • 8/6/2019 Sas Basicsppt

    22/60

    ` Once SAS imports the data, there are two types ofstatements for manipulation and analysis:

    Data Step

    x If-Then-Else statements

    x Mergesx Loops

    Proc Step

    x

    SAS-defined procedures for analysis (summaries,statistics, etc.)

    22

  • 8/6/2019 Sas Basicsppt

    23/60

    ` DATA steps in SAS are used to:

    Read in raw data files

    x Use INPUT statement

    Read in and manipulate SAS data sets

    x Use SETor MERGEstatements

    x Subset data

    x Create additional fields

    x Perform calculations

    x Merge SAS data sets

    x Append SAS data setsx Loop through SAS data sets

    x etc.

    23

  • 8/6/2019 Sas Basicsppt

    24/60

    ` The DROP= option eliminates the specified variables fromthe output data set. For example, DATA newdata1(DROP=field1 field2);

    ` The KEEP= option keeps only the specified variables in the

    output data set. For example, DATA newdata2(KEEP=field3 field4);

    ` The RENAME= option renames variables in the output dataset. This is commonly used for merges. For example,

    DATA new(REN

    AME=(oldname=newname));

    24

  • 8/6/2019 Sas Basicsppt

    25/60

    ` Example of a Data Step

    data rev_chart (keep = acct_type rev_debit_amt rev_debit_ct rev_credit_amtrev_credit_ct

    exp_debit_amt exp_debit_ct exp_credit_amt exp_credit_ct period);set input_data year_input_data;if Acct_Type = 'Revenue' then do;

    rev_debit_amt = debit_amt;rev_debit_ct = debit_cnt;rev_credit_amt = credit_amt;rev_credit_ct = credit_cnt;

    end;else if Acct_Type = 'Expense' then do;

    exp_debit_amt = debit_amt;exp_debit_ct = debit_cnt;

    exp_credit_amt = credit_amt;exp_credit_ct = credit_cnt;

    end;else delete;run;

    25

  • 8/6/2019 Sas Basicsppt

    26/60

    ` If-Then-Else Statements Simple logic in order to perform data manipulation

    data data1;set libname1.perm_data1;IF day_of_Week = "MON" THEN ID = 2; ELSE

    IF day_of_Week = "TUE" THEN ID = 3; ELSEIF day_of_Week = "WED" THEN ID = 4; ELSEIF day_of_Week = "THU" THEN ID = 5; ELSEIF day_of_Week = "FRI" THEN ID = 6; ELSEIF day_of_Week = "SAT" THEN ID = 7; ELSEIF day_of_Week = "SUN" THEN ID = 1; ELSEIF day_of_Week = "HOL" THEN ID = 8;

    run;

    26

  • 8/6/2019 Sas Basicsppt

    27/60

    ` Use If-Then-Else statements to subset your data

    Use If condition to keep records with condition

    Break up data set into several data sets by condition

    ` Example of SAS code to subset data:

    DATA excep(RENAME=(sumfield=excepamt)gooddata(DROP=field1 field2);

    SET origdata;

    sumfield = field1 + field2;

    IF sumfield GT 100 THENOUTPUT excep;ELSEOUTPUT gooddata;

    RUN;

    27

  • 8/6/2019 Sas Basicsppt

    28/60

    ` If there are several steps in an If-Then-Else statement, use a Then Do orElse Do

    Use End statement at end of Do statement

    ` Example of code:

    DATA DATAKEEP DATAEXCLUDE;SET DATA1;IF VAR1="KEEP" THEN DO;

    DATASET="DATAKEEP";

    OUTPUT DATAKEEP;

    END;ELSE DO;

    DATASET="DATAEXCLUDE";

    OUTPUT DATAEXCLUDE;

    END;RUN;

    28

  • 8/6/2019 Sas Basicsppt

    29/60

    ` Blanks and Null Values

    There are several ways to flag blank / null records:

    x charvar = ;

    x numvar = .;

    29

  • 8/6/2019 Sas Basicsppt

    30/60

    ` A SAS merge combines records that have equalvalues based on the field(s) in the BYstatement.

    ` The data sets must be sorted by the field(s) in theBYstatement prior to performing the merge.

    `

    The IF statement will select the records to includein the resulting table.

    ` Many to many merges will not have the same resultas a many to many join.

    ` Overlaying of common variables will occur (Update

    will be helpful) Use PROC CONTENTS and RENAME= data setoption

    30

  • 8/6/2019 Sas Basicsppt

    31/60

    31

    SAS Dataset 1

    CodeIn2

    A 1A 1

    B 1

    C 1

    D 1F 1

    SAS Dataset 4

    CodeAmt

    A 9

    B 1

    C .

    E 2

    F .

    3

    1

    SAS Dataset 2

    CodeAmt

    A 1

    B .

    C 2

    D 5

    F 3

    SAS Dataset 2

    CodeIn1

    A 1

    C 1

    C 1

    E 1

    F 1

    F 1

  • 8/6/2019 Sas Basicsppt

    32/60

    Merge SAS Data Set1-2

    CodeIn2 In1A 1 1A 1 1

    B 1 0C 1 1C 1 1D 1 0E 0 1

    F 1 1F 1 1

    32

    Example SAS MERGE.SAS

    Merge SAS Data Set3-4

    Code AmtA 9B 1C .D 5E 2F .

    Update SAS Data Set3-4

    Code Amt

    A 9B 1C 2D 5E 2F 3

  • 8/6/2019 Sas Basicsppt

    33/60

    ` Example of Merge in Data step:

    data data_merge notina notinb;

    merge data1 (in=a) data2 (in=b);

    by field1 field2 field3;if a and b then output data_merge;

    else if a and not b then output notinb;

    else if not a and b then output noina;

    run;

    http://support.sas.com/techsup/technote/ts705.pdf

    33

  • 8/6/2019 Sas Basicsppt

    34/60

    ` Loops allow the users to iterate through a variable or data step ina controlled manner

    ` Two types of loops in SAS Do Until Do While

    ` Must use an END statement

    ` Example of Do Loop code:

    DATA newfile;SET oldfile;

    DO WHILE (condition);task to perform during condition;

    END;RUN;

    34

  • 8/6/2019 Sas Basicsppt

    35/60

    ` CAT, CATS,CATT,CATX

    35

  • 8/6/2019 Sas Basicsppt

    36/60

    ` Count

    36

  • 8/6/2019 Sas Basicsppt

    37/60

    ` Countc

    37

  • 8/6/2019 Sas Basicsppt

    38/60

    ` Ifc

    ifc (condition, value #1, value #2, value #3)

    looks at a condition and returns user specified, character values if the condition is true, false, orresults in a missing value.

    Parameters: condition (numeric expression)

    value #1: char expression if condition is true.

    value #2: char expression if condition is false.

    value #3: char expression if condition results in a missing value.

    ` Ifn

    ifn (condition, value #1, value #2, value #3)

    looks at a condition and returns user specified, numeric values if the condition is true, false, orresults in a missing value.

    Parameters:

    condition (numeric expression)

    value #1: num expression if condition is true.

    value #2: num expression if condition is false.

    value #3: num expression if condition results in a missing value.

    ` ifn comparison to if/then/else

    com= ifn(sales >= quota, sales* mqpct , sales* nmqpct );

    is the same as:

    if sales >= quota then

    com=sales* mqpct

    ` else

    com=sales* nmqpct );

    38

    ifc.txt

  • 8/6/2019 Sas Basicsppt

    39/60

    ` lengthc, lengthm, lengthn

    Differences:

    lengthc: returns the length of a string including trailing blanks.

    lengthm: returns the length of the variable allocated in memory.

    lengthn: returns the length of a string excluding the trailing blanks.

    1. returns 0 if a string is blank, compared to the length function whichreturns 1.

    ` subpad

    parameters:

    -string string to take an excerpt (substring) from.

    -position location of first character in the substring.

    -length the number of characters in the subpad WILL return a variable with the length specified, padding the resultswith spaces.

    subpad can return a string with a length of zero.

    39

  • 8/6/2019 Sas Basicsppt

    40/60

    ` substrn

    parameters:

    -string string to take an excerpt (substring) from.

    -position - location of first character in the substring.

    -length - the number of characters in the

    same as the substr function except:

    substrn truncates the result when length exceeds the length of the string.

    there will be no error messages for invalid third arguments.

    ` ANYALNUM (variable)beats using: indexc(lowcase(variable),qwertyuiopasdfghjklzxcvbnm1234567890)

    ` ANYALPHA (variable)

    beats using: indexc(lowcase(variable),qwertyuiopasdfghjklzxcvbnm)

    ` NOTALNUM (variable)

    beats using: indexc(lowcase(variable),qwertyuiopasdfghjklzxcvbnm1234567890) =0

    40

    substr.txt

  • 8/6/2019 Sas Basicsppt

    41/60

    ` Proc statements are SAS procedures for dataanalysis

    Proc Contents

    Proc Datasets

    Proc Means Proc Summary

    Proc Freq

    Proc Format

    Proc Univariate Proc Print

    41

  • 8/6/2019 Sas Basicsppt

    42/60

    ` Provides field information and data set information

    Field Info: Name, Type, Format, Label

    Data Set Info: Creation date, Number of observations

    Automatically prints to output

    Very useful to perform prior to a merge to ensurevariables from one data set will not write over variablesfrom the other data set.

    ` Example of code:

    PROC CONTENTS DATA=dataset;

    RUN;

    42

  • 8/6/2019 Sas Basicsppt

    43/60

    ` Provides information regarding a library, including datasets in thelibrary

    Can perform the following:

    x Append datasets together

    x Copy datasets to other libraries

    x Delete datasets

    x Modify the contents of a dataset

    x Rename a dataset

    ` Example of code:

    PROC DATASETS LIBRARY=libname;

    RUN;

    43

    Proc dataset.txt

  • 8/6/2019 Sas Basicsppt

    44/60

    ` Computes descriptive statistics on numeric variables in a SASdata set. Results can be output to a new SAS data set. Defaults to print all descriptive statistics (e.g. mean, standard

    deviation, minimum, etc.) Output automatically prints If no numeric variables are specified in the VAR statement, then

    all numeric variables in the data set are analyzed

    ` Example of code:

    PROC MEANS DATA=oldfile;CLASS division dept;

    VAR units value;

    RUN;

    44

  • 8/6/2019 Sas Basicsppt

    45/60

    ` Computes descriptive statistics on numeric variables in a SASdata set and outputs the results into a new SAS data set

    Descriptive statistics must be specified in the OUTPUTstatement

    Output does not automatically print

    If no numeric variables are specified in the VAR statement, then

    a simple count of the observations is generated.

    ` Example of Code:

    PROC SUMMARY DATA=oldfile NWAY;

    CLASS division dept;

    VAR units value;

    OUTPUT OUT=currfile SUM=;

    RUN;

    45

  • 8/6/2019 Sas Basicsppt

    46/60

    ` Produces 1-way to n-way frequency and cross-tabulationtables

    ` Example of Code:

    PRO

    C FREQ DATA=dataset

    NOPRI

    NT;TABLES variable / LIST MISSINGOUT=datanew;

    RUN;

    DATA datanew2;SET datanew;

    IF COUNT > 1 THENOUTPUT datanew2;RUN;

    46

  • 8/6/2019 Sas Basicsppt

    47/60

    ` PROC FORMAT, because of its name, is most often used tochange the appearance of data for presentation.

    ` Example of Code:Proc format;value tempfmt low -< 61 = ' 1'

    61 -< 63 = ' 2'63 - high = ' 3'other = ' ';run;data pme;set pme;tempcode=put(avgtemp, tempfmt.);Run;http://www2.sas.com/proceedings/sugi30/001-30.pdf

    474

    7

    Proc format.txt

  • 8/6/2019 Sas Basicsppt

    48/60

    ` Provides statistics on numeric variables

    Automatically prints the results to output

    ` From the report:

    Number of missing observations

    Number of records > 0, = 0, and

  • 8/6/2019 Sas Basicsppt

    49/60

    ` Print data set to screen Less frequently used if using PC SAS However, when using SAS on the Mainframe, Proc

    Print would allow you to see your data

    *SET EXTERNAL LOG;

    PROC PRINTTO

    PRINT = "&PATH.\PROGRAMS\&CLIENT. &PERIOD. &PROGRAM..DOC"

    LOG = "&PATH.\PROGRAMS\&CLIENT. &PERIOD. &PROGRAM..DOC

    NEW;

    RUN;

    http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a000146809.htm

    49

  • 8/6/2019 Sas Basicsppt

    50/60

    ` What is a macro?

    A macro is a block of code meant for repetitiveruns through the data

    50

  • 8/6/2019 Sas Basicsppt

    51/60

    ` Example of macro code:

    %MACRO Macro_name(DATASET);

    DATA &DATASET.;

    SET &DATASET.;

    IF Var1=. THEN Var1=0;

    RUN;

    %MEND Macro_name;

    %Macro_name(TOTAL);

    51

  • 8/6/2019 Sas Basicsppt

    52/60

    SAS Output

    52

  • 8/6/2019 Sas Basicsppt

    53/60

    ` There are several ways you can export data:

    Proc Export

    ODS (Output Delivery System)

    DDE (Data Dynamic Exchange)

    53

  • 8/6/2019 Sas Basicsppt

    54/60

    ` Proc Export is the mostcommon way to export data

    ` Very similar to the ProcImport step, but works in

    the reverse` SAS can export to the

    following files:

    Excel

    Access

    DBF Text

    54

    Example

    SAS Program appended

    at #17

  • 8/6/2019 Sas Basicsppt

    55/60

    55

    DOC File

    Inven ory Data for SAS Basi s.txt

    Input Data

    SAS Basi s Exer ise1 Solution.sas

    Input Data

    Exercise1 Exercise2

    DOC File

    AR Inventory for SAS Training.txt AR Data SAS Training.txt

    SAS Basi s Exer ise2 Solution.sas

  • 8/6/2019 Sas Basicsppt

    56/60

    Debugging and Checking SAS

    Programs

    56

  • 8/6/2019 Sas Basicsppt

    57/60

    ` It is essential to check your SAS log for errors:

    Error (in red): Very Bad!

    Warning (in green): Maybe bad!

    ` SAS will not stop its processing if it finds errors!

    57

  • 8/6/2019 Sas Basicsppt

    58/60

    ` Lets run Debug Program.sas to see different types oferrors.

    ` Things to look out for: Colors or perform search for error and warning

    Dropped records or observations

    ` Visualize and understand how your results shouldlook. Did the results turn out as planned?

    5858

    Use inventory raw file appended on slide #50SAS Basics ebug Program.sas

  • 8/6/2019 Sas Basicsppt

    59/60

    ` proc SQL

    ` Macros

    ` Proc Tabulate/Report

    ` Merge using Proc format

    59

  • 8/6/2019 Sas Basicsppt

    60/60