SAS: How to Use EXCEPT in PROC SQL


You can use the EXCEPT operator in PROC SQL in SAS to return only the rows from one dataset that are not in another dataset. By default, EXCEPT also removes duplicate rows from the result.

The following example shows how to use the EXCEPT operator in practice.

Example: Using EXCEPT in PROC SQL in SAS

Suppose we have the following dataset in SAS that contains information about various basketball players:

/*create first dataset*/
data data1;
    input team $ points;
    datalines;
A 12
A 14
A 15
A 18
A 20
A 22
;
run;

/*view first dataset*/
proc print data=data1;
run;

And suppose we have another dataset in SAS that also contains information about various basketball players:

/*create second dataset*/
data data2;
    input team $ points;
    datalines;
A 12
A 14
B 23
B 25
B 29
B 30
;
run;

/*view second dataset*/
proc print data=data2;
run;

We can use the EXCEPT operator in PROC SQL to return only the rows from the first dataset that are not in the second dataset:

/*only return rows from first dataset that are not in second dataset*/
proc sql;
   title 'data1 EXCEPT data2';
   select * from data1
   except
   select * from data2;
quit;

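Notice that only the four rows in the first dataset that do not appear in the second dataset are returned: the team A rows with 15, 18, 20, and 22 points. The rows A 12 and A 14 exist in both datasets, so they are excluded.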
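If the non-matching rows are needed for further processing rather than just displayed, the same query can be wrapped in a CREATE TABLE statement. The following is a minimal sketch that assumes the data1 and data2 datasets created above; the table name only_in_data1 is a hypothetical name chosen for illustration.

```sas
/*save rows from data1 that are not in data2 to a new table*/
/*only_in_data1 is a hypothetical table name*/
proc sql;
   create table only_in_data1 as
   select * from data1
   except
   select * from data2;
quit;

/*view the new table*/
proc print data=only_in_data1;
run;
```

The new table can then be used like any other SAS dataset, for example as input to another procedure or DATA step.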

The order of the two queries matters. If we swap them, the EXCEPT operator returns only the rows from the second dataset that are not in the first dataset:

/*only return rows from second dataset that are not in first dataset*/
proc sql;
   title 'data2 EXCEPT data1';
   select * from data2
   except
   select * from data1;
quit;

Notice that only the four rows in the second dataset that do not appear in the first dataset are returned: the team B rows with 23, 25, 29, and 30 points.
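Both queries above use plain EXCEPT, which removes duplicate rows from its result. When duplicates in the first dataset should be kept, the ALL keyword can be added. The sketch below again assumes the data1 and data2 datasets from this example. Because neither dataset contains duplicate rows, it should return the same rows as the first query; the difference would only show up with repeated rows in data1.

```sas
/*return rows from data1 not in data2 without removing duplicates*/
proc sql;
   title 'data1 EXCEPT ALL data2';
   select * from data1
   except all
   select * from data2;
quit;

/*clear the title so it does not carry over to later output*/
title;
```

With ALL, PROC SQL does not need to eliminate duplicates from the result, so this form may also be worth trying on large datasets that are already known to contain unique rows.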

Additional Resources

The following tutorials explain how to perform other common tasks in SAS:

SAS: How to Use UNION in PROC SQL
SAS: How to Use Proc Univariate by Group
SAS: How to Use Proc Contents

One Reply to “SAS: How to Use EXCEPT in PROC SQL”

  1. Is there any way to tell EXCEPT to stop checking the first dataset once it has removed a row? I have two LARGE datasets, A and B, and want to find A EXCEPT B. Most rows in A should be in B, and there are no duplicate rows, and I feel like it keeps checking for more rows in A that match the current row in B and wasting a lot of time.
