How to Select a Random Sample in SAS (With Examples)


Here are the two most common ways to select a simple random sample of rows from a dataset in SAS:

Method 1: Select Random Sample Using Sample Size

proc surveyselect data=original_data
    out=random_sample
    method=srs /*specify simple random sampling as sampling method*/
    sampsize=3 /*select 3 observations randomly*/
    seed=123; /*set seed to make this example reproducible*/
run;

Method 2: Select Random Sample Using Proportion of Total Observations

proc surveyselect data=original_data
    out=random_sample
    method=srs /*specify simple random sampling as sampling method*/
    samprate=0.2 /*select 20% of all observations randomly*/
    seed=123; /*set seed to make this example reproducible*/
run;

The examples below show how to use each method with this dataset in SAS:

/*create dataset*/
data original_data;
    input team $ points rebounds;
    datalines;
Warriors 25 8
Wizards 18 12
Rockets 22 6
Celtics 24 11
Thunder 27 14
Spurs 33 19
Nets 31 20
Mavericks 34 10
Kings 22 11
Pelicans 39 23
;
run;

/*view dataset*/
proc print data=original_data;
run;

Example 1: Select Random Sample Using Sample Size

The following code shows how to select a random sample of observations from the dataset using a sample size of n=3:

/*select random sample*/
proc surveyselect data=original_data
    out=random_sample
    method=srs
    sampsize=3
    seed=123;
run;

/*view random sample*/
proc print data=random_sample;
run;

We can see that three rows were randomly selected from the original dataset.
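
To see the sampled rows in the context of the full dataset, one option is to add outall to the procedure. It keeps every input row in the output and adds a Selected variable (1 for sampled rows, 0 otherwise). The following is a minimal sketch rather than part of the original example, and the output name flagged_data is a placeholder chosen for this illustration:

```sas
/*keep all rows and flag the sampled ones*/
proc surveyselect data=original_data
    out=flagged_data /*hypothetical output name for this sketch*/
    method=srs
    sampsize=3
    seed=123
    outall; /*adds Selected variable: 1 = in sample, 0 = not in sample*/
run;

/*view only the sampled rows*/
proc print data=flagged_data;
    where Selected = 1;
run;
```

Printing flagged_data without the where statement would show all 10 rows along with the Selected flag.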

Example 2: Select Random Sample Using Proportion of Total Observations

The following code shows how to select a random sample from the dataset by using the samprate option to request 20% of the original observations:

/*select random sample*/
proc surveyselect data=original_data
    out=random_sample
    method=srs
    samprate=0.2
    seed=123;
run;

/*view random sample*/
proc print data=random_sample;
run;

We can see that 20% of the total observations (20% * 10 observations = 2) from the original dataset were randomly selected to be in our sample.
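
As a quick check of that arithmetic, a sketch like the following counts the rows in random_sample. It assumes the Example 2 code has already been run, and the column alias n_sampled is simply a name chosen here:

```sas
/*count the rows in the sample*/
proc sql;
    select count(*) as n_sampled
    from random_sample;
quit;
```

With 10 input rows and samprate=0.2, the reported count should be 2.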

Additional Resources

The following tutorials explain how to perform other common tasks in SAS:

How to Use Proc Summary in SAS
How to Rename Variables in SAS
How to Create New Variables in SAS
