How to Select a Random Sample in SAS (With Examples)

Here are the two most common ways to select a simple random sample of rows from a dataset in SAS:

Method 1: Select Random Sample Using Sample Size

```
proc surveyselect data=original_data
out=random_sample
method=srs /*specify simple random sampling as sampling method*/
sampsize=3 /*select 3 observations randomly*/
seed=123; /*set seed to make this example reproducible*/
run;
```

Method 2: Select Random Sample Using Proportion of Total Observations

```
proc surveyselect data=original_data
out=random_sample
method=srs /*specify simple random sampling as sampling method*/
samprate=0.2 /*select 20% of all observations randomly*/
seed=123; /*set seed to make this example reproducible*/
run;
```

The examples below show how to use each method with the following dataset in SAS:

```
/*create dataset*/
data original_data;
input team $ points rebounds;
datalines;
Warriors 25 8
Wizards 18 12
Rockets 22 6
Celtics 24 11
Thunder 27 14
Spurs 33 19
Nets 31 20
Mavericks 34 10
Kings 22 11
Pelicans 39 23
;
run;

/*view dataset*/
proc print data=original_data;
run;
```

Example 1: Select Random Sample Using Sample Size

The following code shows how to select a random sample of observations from the dataset using a sample size of n=3:

```
/*select random sample*/
proc surveyselect data=original_data
out=random_sample
method=srs
sampsize=3
seed=123;
run;

/*view random sample*/
proc print data=random_sample;
run;
```

We can see that three rows were randomly selected from the original dataset.
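To see which rows were selected in the context of the full dataset, one option is the `outall` option of `proc surveyselect`. The following is a minimal sketch rather than part of the original example. The output dataset name `flagged_data` is an assumption chosen for illustration.

```sas
/*keep every row and flag the sampled ones*/
proc surveyselect data=original_data
out=flagged_data /*hypothetical output dataset name*/
method=srs
sampsize=3
seed=123
outall; /*adds a Selected variable: 1 = sampled, 0 = not sampled*/
run;

/*view all rows along with the selection flag*/
proc print data=flagged_data;
run;
```

With `outall`, the output dataset contains all 10 original rows instead of only the sampled rows, so the sample can be recovered by filtering on `Selected = 1`.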

Example 2: Select Random Sample Using Proportion of Total Observations

The following code shows how to select a random sample of observations from the dataset by using the samprate option to specify that we’d like the random sample to represent 20% of all original observations:

```
/*select random sample*/
proc surveyselect data=original_data
out=random_sample
method=srs
samprate=0.2
seed=123;
run;

/*view random sample*/
proc print data=random_sample;
run;
```

We can see that 20% of the total observations (20% * 10 observations = 2) from the original dataset were randomly selected to be in our sample.
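One way to double-check the size of the sample is to count the rows in the output dataset. The following is a minimal sketch that assumes the `random_sample` dataset from the example above already exists. The column alias `n_sampled` is a name chosen for illustration.

```sas
/*count the rows in the random sample*/
proc sql;
select count(*) as n_sampled
from random_sample;
quit;
```

If the count matches the 20% * 10 = 2 observations described above, the sampling rate was applied as expected.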