How to Subset Data in SAS (3 Examples)


Here are the three most common ways to subset a dataset in SAS:

Method 1: Choose Which Columns to Keep

data new_data;
    set original_data;
    keep var1 var3;
run;

Method 2: Choose Which Columns to Drop

data new_data;
    set original_data;
    drop var4;
run;

Method 3: Choose Which Rows to Keep Based on Condition

data new_data;
    set original_data;
    if var1 < 25 then delete;
run;

The following examples show how to use each method with the following dataset in SAS:

/*create dataset*/
data original_data;
    input team $ points rebounds;
    datalines;
Warriors 25 8
Wizards 18 12
Rockets 22 6
Celtics 24 11
Thunder 27 14
Spurs 33 19
Nets 31 20
;
run;

/*view dataset*/
proc print data=original_data;

Example 1: Choose Which Columns to Keep

The following code shows how to subset a dataset by using the KEEP statement to keep only certain columns:

/*create new dataset*/
data new_data;
    set original_data;
    keep team points;
run;

/*view new dataset*/
proc print data=new_data;

Example 2: Choose Which Columns to Drop

The following code shows how to subset a dataset by using the DROP statement to drop specific columns:

/*create new dataset*/
data new_data;
    set original_data;
    drop points;
run;

/*view new dataset*/
proc print data=new_data;

Example 3: Choose Which Rows to Keep Based on Condition

The following code shows how to subset a dataset by using the DELETE statement to drop specific rows from the dataset where the value in the points column is less than 25:

/*create new dataset*/
data new_data;
    set original_data;
    if points < 25 then delete;
run;

/*view new dataset*/
proc print data=new_data;

You can also use the OR|” operator to drop the rows where points is less than 25 or rebounds is less than 10:

/*create new dataset*/
data new_data;
    set original_data;
    if points < 25 | rebounds < 10 then delete;
run;

/*view new dataset*/
proc print data=new_data;

You can also use the AND&” operator to drop the rows where points is less than 25 and rebounds is less than 10:

/*create new dataset*/
data new_data;
    set original_data;
    if points < 25 & rebounds < 10 then delete;
run;

/*view new dataset*/
proc print data=new_data;

Additional Resources

The following tutorials explain how to perform other common tasks in SAS:

How to Rename Variables in SAS
How to Create New Variables in SAS
How to Replace Characters in a String in SAS

Leave a Reply

Your email address will not be published. Required fields are marked *