SAS: How to Merge If A Not B


You can use the following basic syntax to merge two datasets in SAS and only return the rows where a value exists in the first dataset and not the second dataset:

data final_data;
  merge data1 (in = a) data2 (in = b);
  by ID;
  if a and not b;
run;

This particular example merges the datasets called data1 and data2 and only returns the rows where a value exists in data1 and not in data2.

The following example shows how to use this syntax in practice.

Example: Merge Two Datasets in SAS Using If A Not B

Suppose we have the following two datasets in SAS that contain information about sales associates at some company:

/*create first dataset*/
data data1;
    input ID Gender $;
    datalines;
1 Male
2 Male
3 Female
4 Male
5 Female
;
run;

title "data1";
proc print data = data1;

/*create second dataset*/
data data2;
    input ID Sales;
    datalines;
1 22
2 15
4 29
6 31
7 20
8 13
;
run;

title "data2";
proc print data = data2;

If we use the following merge statement to merge the two datasets based on the value in the ID column, all rows will be returned from both datasets:

/*perform merge*/
data final_data;
  merge data1 data2;
  by ID;
run;

/*view results*/
title "final_data";
proc print data=final_data;

However, we can use IN statements to only return the rows where a value exists in the first dataset and not in the second dataset:

data final_data;
  merge data1 (in = a) data2 (in = b);
  by ID;
  if a and not b;
run;

/*view results*/
title "final_data";
proc print data=final_data;

Notice that only the rows where a value exists in the first dataset and not the second dataset are returned.

Note: You can find the complete documentation for the SAS merge statement here.

Additional Resources

The following tutorials explain how to perform other common tasks in SAS:

How to Perform a Left Join in SAS
How to Perform an Inner Join in SAS
How to Perform an Outer Join in SAS

Leave a Reply

Your email address will not be published. Required fields are marked *