Here are the three most common ways to subset a dataset in SAS:
Method 1: Choose Which Columns to Keep
data new_data;
set original_data;
keep var1 var3;
run;
Method 2: Choose Which Columns to Drop
data new_data;
set original_data;
drop var4;
run;
Method 3: Choose Which Rows to Keep Based on Condition
data new_data;
set original_data;
if var1 < 25 then delete;
run;
The following examples show how to use each method with the following dataset in SAS:
/*create dataset*/
data original_data;
input team $ points rebounds;
datalines;
Warriors 25 8
Wizards 18 12
Rockets 22 6
Celtics 24 11
Thunder 27 14
Spurs 33 19
Nets 31 20
;
run;
/*view dataset*/
proc print data=original_data;
Example 1: Choose Which Columns to Keep
The following code shows how to subset a dataset by using the KEEP statement to keep only certain columns:
/*create new dataset*/
data new_data;
set original_data;
keep team points;
run;
/*view new dataset*/
proc print data=new_data;
Example 2: Choose Which Columns to Drop
The following code shows how to subset a dataset by using the DROP statement to drop specific columns:
/*create new dataset*/
data new_data;
set original_data;
drop points;
run;
/*view new dataset*/
proc print data=new_data;
Example 3: Choose Which Rows to Keep Based on Condition
The following code shows how to subset a dataset by using the DELETE statement to drop specific rows from the dataset where the value in the points column is less than 25:
/*create new dataset*/
data new_data;
set original_data;
if points < 25 then delete;
run;
/*view new dataset*/
proc print data=new_data;
You can also use the OR “|” operator to drop the rows where points is less than 25 or rebounds is less than 10:
/*create new dataset*/
data new_data;
set original_data;
if points < 25 | rebounds < 10 then delete;
run;
/*view new dataset*/
proc print data=new_data;
You can also use the AND “&” operator to drop the rows where points is less than 25 and rebounds is less than 10:
/*create new dataset*/
data new_data;
set original_data;
if points < 25 & rebounds < 10 then delete;
run;
/*view new dataset*/
proc print data=new_data;
Additional Resources
The following tutorials explain how to perform other common tasks in SAS:
How to Rename Variables in SAS
How to Create New Variables in SAS
How to Replace Characters in a String in SAS