Data Processing with PC-SAS
PubH 6325

J. Michael Oakes, PhD
Associate Professor
Division of Epidemiology
University of Minnesota

Lecture 2/4

Lecture 2

• DATA STEP & PROGRAMMING BASICS
• BOOLEAN LOGIC
• SAS OPERATORS
• SUBSETTING DATA
• LABELLING VARIABLES, DATA
• SIMPLE PROCEDURES

PC SAS

Data -> SAS Program: SAS reads data as per instructions
SAS Program -> Output: SAS writes data and/or text as per instructions

SAS Fundamentals
SAS does exactly(!) the stuff you tell it to do in your SAS program.

It cannot read your mind!

SAS Fundamentals
The devil is in the syntax!

(Sarcophilus harrisii)

SAS Fundamentals
Basic Programming Stuff:
Statements must end with a semicolon ;
UPPER or lower case does not matter (except inside quoted strings and data values)

SAS Fundamentals
Statements may take several lines
Variable names must be 1-32 characters (8 is best) and begin with a letter or an underscore, _
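The rules above can be seen together in a short sketch. The data set name `demo` and the variable names `id`, `Height_In`, and `_weight` are made up for illustration and are not from the course files.

```sas
/* Every statement ends with a semicolon. */
data demo;
    input id
          Height_In      /* one statement may span several lines */
          _weight;       /* names may start with a letter or an underscore */
    datalines;
1 68 144
2 62 99
;
run;

/* Keywords and names are not case-sensitive: DEMO is the same data set as demo. */
PROC PRINT DATA=DEMO;
RUN;
```

Leaving off a single semicolon usually makes SAS read two statements as one, which is a common source of confusing log messages.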

SAS Programming
Two categories of SAS statements/commands:

Data Step (basically data mgt)
Procedure Step (basically analysis)
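A minimal sketch of the two kinds of step, assuming the sample data set `sashelp.class` that ships with SAS; the names `heights` and `height_cm` are hypothetical.

```sas
/* DATA step: data management (read data, create variables, write a data set) */
data heights;
    set sashelp.class;            /* sample data; height is recorded in inches */
    height_cm = height * 2.54;    /* new variable */
run;

/* PROC step: analysis of data that already exist */
proc means data=heights;
    var height_cm;
run;
```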

Data Step Processing
[Flowchart of DATA step processing]
SUBMIT DATA STEP PROGRAM
-> COMPILE
-> CREATE: Input Buffer, PDV, Descriptor Information
-> Process DATA statement
-> Set missing values
-> RECORD TO READ?
     NO:  End DATA step program
     YES: Read INPUT record
          -> Execute other statements
          -> Write observation to SAS data set
          -> RETURN to the DATA statement

Data Step Processing
SUBMIT DATA STEP PROGRAM -> COMPILE

The data step begins with the DATA statement in your program.

In this phase, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. SAS then identifies the type and length of each new variable, and determines whether a type conversion is necessary for each subsequent reference to a variable.

Data Step Processing
CREATE: Input Buffer, PDV, Descriptor Information

In this phase, SAS creates:

Input Buffer: A logical area in RAM into which SAS reads each record when it reads raw data.

Program Data Vector (PDV): A logical area in RAM where SAS builds a data set, one observation at a time. From here, SAS writes the values to a SAS data set as a single observation. Along with data set variables and newly computed variables, the PDV contains two automatic variables, _N_ and _ERROR_.

Descriptor Information: Information that SAS creates and maintains about each SAS data set, including data set attributes and variable attributes. It contains, for example, the name of the data set and its member type, the date and time that the data set was created, and the number, names and data types (character or numeric) of the variables.
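One way to look at the PDV directly is to write it to the log on every pass through the data step. This is a small sketch with made-up variables (`name`, `score`); `data _null_` runs the step without creating a data set.

```sas
data _null_;
    input name $ score;
    put _all_;    /* writes every PDV variable, including _N_ and _ERROR_, to the log */
    datalines;
Sue 6
Jane 9
;
run;
```

The log should show one line per record, with _N_ counting the iterations of the data step.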

Data Step Processing
data total_points (drop=TeamName);   /* TeamName is read but not written out */
   input TeamName $ ParticipantName $ Event1 Event2 Event3;
   TeamTotal = (Event1 + Event2 + Event3);
   datalines;
Knights Sue 6 8 8
Cardinals Jane 9 7 8
Knights John 7 7 7
Knights Lisa 8 9 9
Knights Fran 7 6 6
Knights Walter 9 8 10
;
run;

Data Step Processing
Input record: Knights Sue 6 8 8
Build PDV for named variables

Variable:  TeamName  ParticipantName  Event1  Event2  Event3  TeamTotal  _N_   _ERROR_
Flag:      Drop                                                          Drop  Drop

Data Step Processing
Input record: Knights Sue 6 8 8
Set missing values: fill in PDV placeholders for variables

Variable:  TeamName  ParticipantName  Event1  Event2  Event3  TeamTotal  _N_   _ERROR_
Value:     (blank)   (blank)          .       .       .       .          1     0
Flag:      Drop                                                          Drop  Drop

Data Step Processing
Input record: Knights Sue 6 8 8
Read INPUT record: fill PDV with "data"

Variable:  TeamName  ParticipantName  Event1  Event2  Event3  TeamTotal  _N_   _ERROR_
Value:     Knights   Sue              6       8       8       .          1     0
Flag:      Drop                                                          Drop  Drop

Data Step Processing
Input record: Knights Sue 6 8 8
Execute other statements: calculate "TeamTotal" variable

Variable:  TeamName  ParticipantName  Event1  Event2  Event3  TeamTotal  _N_   _ERROR_
Value:     Knights   Sue              6       8       8       22         1     0
Flag:      Drop                                                          Drop  Drop

Data Step Processing
Input record: Knights Sue 6 8 8
Write/output observation to SAS data set

Variable:  ParticipantName  Event1  Event2  Event3  TeamTotal
Value:     Sue              6       8       8       22

Data Step Processing
Input record: Cardinals Jane 9 7 8
Return and set _N_ to 2, repeat sequence

Variable:  TeamName  ParticipantName  Event1  Event2  Event3  TeamTotal  _N_   _ERROR_
Value:     (blank)   (blank)          .       .       .       .          2     0
Flag:      Drop                                                          Drop  Drop

PDV states, step by step (TeamName, _N_ and _ERROR_ are flagged Drop)

Input record: Knights Sue 6 8 8

Step                  TeamName  ParticipantName  Event1  Event2  Event3  TeamTotal  _N_  _ERROR_
Set missing values    (blank)   (blank)          .       .       .       .          1    0
Read INPUT record     Knights   Sue              6       8       8       .          1    0
Execute statements    Knights   Sue              6       8       8       22         1    0
Write observation               Sue              6       8       8       22

Input record: Cardinals Jane 9 7 8

Step                  TeamName  ParticipantName  Event1  Event2  Event3  TeamTotal  _N_  _ERROR_
Return, reset values  (blank)   (blank)          .       .       .       .          2    0


Programming (very) Basics
Programs:

• Document tasks
• Permit replication

Programming (very) Basics
Good programming practices:

• Comment on name of prog
• Date written
• Author
• Purpose
• Use comments often

Programming Basics
Example:
***This code appears in Chapter 1 of SAS Programming by Example.***
*** Example 1 ***;
DATA LISTINP;
   INPUT ID HEIGHT WEIGHT GENDER $ AGE;
DATALINES;
1 68 144 M 23
2 78 202 M 34
3 62 99 F 37
4 61 101 F 45
;
PROC PRINT DATA=LISTINP;
   TITLE 'Example 1';
RUN;

Programming Basics
Better Example:
* analysis.sas                                            ;
* Program runs analyses on Ed Kaplan's Strep Data         ;
* Originally written on 12/12/01   JMO                    ;
**********************************************************;
data comb;
   set kaplan.combined;

proc nlmixed data=comb;
   parms beta0=-1 s2u=1;
   eta = beta0 + u;
   expeta = exp(eta);
   p = expeta/(1+expeta);
   model endpoint ~ binary(p);
   random u ~ normal(0,s2u) subject=inv;
   estimate 'sigma2' s2u;
run;

Reading SAS Data
A DATA statement "writes" SAS data
To read in existing SAS data…

Use the SET statement…
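A minimal sketch of the read/write pairing, assuming the sample data set `sashelp.class` that ships with SAS; the output name `class_copy` is hypothetical.

```sas
data class_copy;          /* DATA names the data set to write */
    set sashelp.class;    /* SET names the existing SAS data set to read */
run;
```

Each pass through the data step reads one observation from the SET data set into the PDV and writes it to the new data set.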


Reading SAS Data
Version stuff:

Data Library Engines:
Indicate which version of SAS you want to read from and/or write to.
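The engine is named on the LIBNAME statement, between the libref and the path. The sketch below assumes hypothetical folders and a hypothetical data set name (`combined`); which engines are available, and whether older ones are read-only, depends on your SAS release and operating system.

```sas
/* Hypothetical folders: replace with directories that exist on your PC. */
libname newlib v9    'C:\mydata\current';         /* SAS 9 data sets */
libname oldlib v6    'C:\mydata\archive';         /* version 6 data sets, if supported */
libname trans  xport 'C:\mydata\transport.xpt';   /* SAS transport file */

/* Read from the old library and write to the new one. */
data newlib.combined;
    set oldlib.combined;
run;
```

If no engine is named, SAS chooses one based on the files it finds in the folder.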

 

Writing SAS Data
Two ways to go:

Temporary SAS file
Permanent SAS file

Writing SAS Data
Temporary SAS file
   dulldata   is the same as   work.dulldata
Where the heck is the WORK directory?
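A sketch of both kinds of file, plus one way to answer the question about the WORK directory. The libref `mylib` and the folder `C:\mydata` are hypothetical; `sashelp.class` is sample data that ships with SAS.

```sas
/* Temporary: a one-level name goes to the WORK library,
   which is cleared when the SAS session ends. */
data dulldata;                 /* same as: data work.dulldata; */
    set sashelp.class;
run;

/* Permanent: a two-level name whose libref points to a folder. */
libname mylib 'C:\mydata';     /* hypothetical folder; it must already exist */
data mylib.dulldata;
    set work.dulldata;
run;

/* Write the physical location of the WORK library to the log. */
%put WORK is stored in: %sysfunc(pathname(work));
```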

Boolean Logic
Or |     Not ^ (or ~; in SAS, ! is read as OR)     And &

George Boole, FRS (1815-1864)
John Venn, FRS (1834-1923)
[Venn diagram of sets A, B, and C]

SAS Operators
SAS operators are symbols that request a comparison, a logical operation, an arithmetic calculation, or a concatenation.

Operator  Meaning         Example
+         Addition        a+b
-         Subtraction     a-b
*         Multiplication  a*b
/         Division        a/b
**        Exponentiation  a**b
||        Concatenate     'a' || 'b' yields 'ab'

SAS Operators

Symbol  Mnemonic  Meaning                   Example        Mnemonic example
<       LT        less than                 a < b          a LT b
>       GT        greater than              a > b          a GT b
<=      LE        less than or equal to     a <= b         a LE b
>=      GE        greater than or equal to  a >= b         a GE b
=       EQ        equal to                  a = b          a EQ b
~=      NE        not equal to              a ~= b         a NE b
><      MIN       minimum of                z = (a >< b)   z = (a MIN b)
<>      MAX       maximum of                z = (a <> b)   z = (a MAX b)
&       AND       Boolean "and"             a & b          a and b
|       OR        Boolean "or"              a | b          a or b
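The operators above can be tried out in a data step that writes its results to the log. This is a sketch with made-up values and variable names; `data _null_` runs the step without creating a data set.

```sas
data _null_;
    a = 7;
    b = 3;
    total    = a + b;                /* addition */
    power    = a ** b;               /* exponentiation */
    joined   = 'a' || 'b';           /* concatenation */
    a_gt_b   = (a > b);              /* a comparison yields 1 (true) or 0 (false) */
    a_le_b   = (a LE b);             /* mnemonic form of <= */
    smallest = (a >< b);             /* MIN operator */
    largest  = (a <> b);             /* MAX operator in a DATA step expression */
    both     = (a > 5) & (b > 5);    /* AND */
    either   = (a > 5) | (b > 5);    /* OR */
    put total= power= joined= a_gt_b= a_le_b= smallest= largest= both= either=;
run;
```

Because a comparison returns 1 or 0, an expression such as z=(a>b) is a quick way to build an indicator variable.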

Manipulating SAS Data Sets

Subsetting

if     where     obs=     keep     drop
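Each of the five subsetting tools is sketched below against the sample data set `sashelp.class` that ships with SAS. The output data set names are hypothetical.

```sas
/* Subsetting IF: continue with the observation only if the condition is true. */
data girls;
    set sashelp.class;
    if sex = 'F';
run;

/* WHERE: filter observations as they are read. */
data teens;
    set sashelp.class;
    where age >= 13;
run;

/* OBS= data set option: stop reading after the first 5 observations. */
data first5;
    set sashelp.class (obs=5);
run;

/* KEEP= and DROP= data set options: choose which variables are written. */
data names_ages (keep=name age);
    set sashelp.class;
run;

data no_size (drop=height weight);
    set sashelp.class;
run;
```

IF and WHERE select rows (observations); KEEP and DROP select columns (variables).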

Manipulating a SAS Dataset
Generate a new variable
Formats
Rename
Variable Labels
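The four tasks can be combined in one data step. This sketch assumes the sample data set `sashelp.class`, where weight is in pounds and height is in inches; the names `class2`, `gender`, and `bmi` are hypothetical.

```sas
data class2 (label='Class roster with BMI');   /* data set label */
    set sashelp.class (rename=(sex=gender));   /* rename a variable as it is read */
    bmi = 703 * weight / height**2;            /* generate a new variable */
    format bmi 5.1;                            /* display with one decimal place */
    label bmi    = 'Body mass index'
          gender = 'Gender (M/F)';             /* variable labels */
run;
```

Formats and labels are stored in the descriptor information, so they travel with the data set into later procs.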

SAS Procedures
Procedure Steps
• A proc statement runs a SAS proc.
• Most procs use data created in the data step.
• The syntax for most procs is about the same.

SAS Procedures
Proc Contents
The CONTENTS procedure prints the descriptor information of a SAS data set (variable names, types, lengths, labels, and so on) to the output file.
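A minimal call, assuming the sample data set `sashelp.class` that ships with SAS. The VARNUM option lists variables in the order they were created rather than alphabetically.

```sas
proc contents data=sashelp.class varnum;
run;
```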

SAS Procedures
Proc Print
The PRINT procedure prints the observations (i.e., data) in a SAS data set, using all or some of the variables as you select (to the output file).
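A short sketch of printing some variables for some observations, again assuming the sample data set `sashelp.class`; the title text is made up.

```sas
proc print data=sashelp.class (obs=10) label;   /* first 10 observations, labels as headings */
    var name age height;                        /* print only these variables */
    title 'First 10 students';
run;
title;                                          /* clear the title for later output */
```

Leaving out the VAR statement prints every variable in the data set.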

Example programs
See 'day 2 programs.sas'

Subsetting, Manipulating, Generating, Labeling

Lab 2
(First Hour) Directed Learning

• SAS program writing and saving
• Reading and writing SAS data
• Basic subsetting examples
• Labeling variables, datasets
• Basic procs (contents, print)

(Second Hour) Lab Assignment

• Write a professional-quality SAS program to read and write simple data, subset and manipulate as per directions.
