Finding unique and duplicate rows in SAS

The code below shows how to find unique and duplicate rows in a dataset and separate them into two different datasets.
The variables you want to examine for uniqueness have to be listed in the by-statement, and each one gets its own not(first.<variable> and last.<variable>) condition. The input must be sorted by those variables, otherwise the data step stops with a sort-order error. Be aware that from SAS 9.3 onward there is an easier solution using proc sort.

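/* By-group processing needs sorted input, and sashelp.class is not stored in this order. */
/* class_sorted is just a working copy. */
proc sort data=sashelp.class out=class_sorted;
 by Age Height Name Weight;
run;

data unique dups;
 set class_sorted;
 by Age Height Name Weight;
 /* a row is a duplicate when it is not the only row in its by-group */
 if not(first.Age and last.Age)
 and not(first.Height and last.Height)
 and not(first.Name and last.Name)
 and not(first.Weight and last.Weight) then output dups;
 else output unique;
run;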
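
Because the first. and last. flags are nested (a change in an earlier by-variable also sets the flags of every later one), testing only the last by-variable should be enough to decide whether the full key is unique. A minimal sketch under that assumption; the dataset names class_sorted, unique_short and dups_short are made up for illustration:

```sas
/* Sort a copy first: by-group processing needs sorted input. */
proc sort data=sashelp.class out=class_sorted;
 by Age Height Name Weight;
run;

data unique_short dups_short;
 set class_sorted;
 by Age Height Name Weight;
 /* first.Weight and last.Weight are both 1 only when the whole
    key (Age, Height, Name, Weight) occurs exactly once */
 if first.Weight and last.Weight then output unique_short;
 else output dups_short;
run;
```

The longer condition with one not(...) per variable should give the same split; it just spells out every level.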

This is different from what proc sort with nodupkey does, which was the only proc sort approach before SAS 9.3:

proc sort data=sashelp.class nodupkey out=unique dupout=dups;
 by Age Height Name Weight;
run;

The code above keeps the first row of each group of duplicates in the unique dataset and sends only the remaining rows to dups. It does not completely separate unique and duplicate rows from each other.
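
To see this on a small case, here is a sketch with a made-up four-row dataset (demo, id, val, demo_unique and demo_dups are illustrative names, not part of the original example):

```sas
data demo;
 input id val $;
 datalines;
1 a
2 b
2 c
3 d
;
run;

proc sort data=demo nodupkey out=demo_unique dupout=demo_dups;
 by id;
run;

/* demo_unique holds (1,a) (2,b) (3,d): id 2 is still in here */
/* demo_dups holds (2,c): only the second id=2 row ends up here */
```

A full separation would instead put both id=2 rows into the duplicates dataset and leave only ids 1 and 3 in the unique one.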

In SAS 9.3 proc sort has a new option, uniqueout=, which is used together with nouniquekeys. It does the job of the data step with much less code. I haven’t tried it, but I imagine that this is how it works.

proc sort data=sashelp.class nouniquekeys uniqueout=singles out=dublet; 
 by Age Height Name Weight;
run;
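
Since the proc sort variant is untested here, one way to check it would be to compare its unique rows with those from the data step. A sketch, assuming SAS 9.3 or later; the names class_sorted, ds_unique, ds_dups, ps_unique and ps_dups are chosen for illustration only:

```sas
/* Data step approach, on a sorted copy of the input */
proc sort data=sashelp.class out=class_sorted;
 by Age Height Name Weight;
run;

data ds_unique ds_dups;
 set class_sorted;
 by Age Height Name Weight;
 if not(first.Age and last.Age)
 and not(first.Height and last.Height)
 and not(first.Name and last.Name)
 and not(first.Weight and last.Weight) then output ds_dups;
 else output ds_unique;
run;

/* proc sort approach: unique keys go to uniqueout=, the rest to out= */
proc sort data=sashelp.class nouniquekeys
          uniqueout=ps_unique out=ps_dups;
 by Age Height Name Weight;
run;

/* Row-by-row comparison of the two "unique" datasets */
proc compare base=ds_unique compare=ps_unique;
run;
```

If both approaches agree, proc compare should report no unequal values. The duplicate datasets could be checked the same way, preferably with a shorter key so that they are not empty.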

 
