You can use the following basic syntax to merge two datasets in SAS based on two variables matching:
data final_data;
merge data1 (in = a) data2 (in = b);
by ID Store;
if a and b;
run;
This particular example merges the datasets called data1 and data2 based on the variables called ID and Store and only returns the rows where a value exists in both datasets.
The following example shows how to use this syntax in practice.
Example: Merge Datasets in SAS Based on Two Variables
Suppose we have the following dataset in SAS that contains information about sales associates at some company:
/*create first dataset*/
data data1;
input ID Store $;
datalines;
1 A
1 B
1 C
2 A
2 C
3 A
3 B
;
run;
/*view first dataset*/
title "data1";
proc print data = data1;
And suppose we have another dataset that contains information about the sales made at various stores by each associate:
/*create second dataset*/
data data2;
input ID Store $ Sales;
datalines;
1 A 22
1 B 25
2 A 40
2 B 24
2 C 29
3 A 12
3 B 15
;
run;
/*view second dataset*/
title "data2";
proc print data = data2;
We can use the following merge statement to merge the two datasets based on matching values in the ID and Store columns, then only return rows where a value exists in both columns:
/*perform merge*/
data final_data;
merge data1 (in = a) data2 (in = b);
by ID Store;
if a and b;
run;
/*view results*/
title "final_data";
proc print data=final_data;
The resulting dataset returns the rows where the values in the ID and Store columns both match.
Note: You can find the complete documentation for the SAS merge statement here.
Additional Resources
The following tutorials explain how to perform other common tasks in SAS:
SAS: How to Perform One-to-Many Merge
SAS: How to Use (in=a) in Merge Statement
SAS: How to Merge If A Not B