How to Use PROC SURVEYSELECT in SAS (With Examples)


You can use PROC SURVEYSELECT to select a random sample from a dataset in SAS.

Here are three common ways to use this procedure in practice:

Example 1: Use PROC SURVEYSELECT to Select Simple Random Sample

proc surveyselect data=my_data
    out=my_sample
    method=srs    /*use simple random sampling*/
    n=5           /*select a total of 5 observations*/
    seed=1;       /*set seed to make this example reproducible*/
run;

This particular example selects 5 random observations from the entire dataset.

Example 2: Use PROC SURVEYSELECT to Select Stratified Random Sample

proc surveyselect data=my_data
    out=my_sample
    method=srs           /*use simple random sampling*/
    n=2                  /*select 2 observations from each stratum*/
    seed=1;              /*set seed to make this example reproducible*/
    strata grouping_var; /*specify variable to use for stratification*/
run;

This particular example selects 2 random observations from each unique stratum in the dataset.

The strata statement specifies the variable to use for stratification.

Example 3: Use PROC SURVEYSELECT to Select Clustered Random Sample

proc surveyselect data=my_data
    out=my_sample
    n=2                   /*select 2 clusters*/
    seed=1;               /*set seed to make this example reproducible*/
    cluster grouping_var; /*specify variable to use for clustering*/
run;

This particular example selects 2 random clusters from the dataset and includes every observation from each cluster in the sample.

The cluster statement specifies the variable to use for clustering.

The examples below show how to use each method in practice with a SAS dataset that contains information about basketball players on various teams:

/*create dataset*/
data my_data;
    input team $ points;
    datalines;
A 12
A 14
A 22
A 35
A 40
B 12
B 10
B 29
B 33
C 40
C 25
C 11
C 10
C 15
;
run;

/*view dataset*/
proc print data=my_data;
run;

Example 1: Use PROC SURVEYSELECT to Select Simple Random Sample

We can use the following syntax to select a simple random sample of 5 observations from the entire dataset:

proc surveyselect data=my_data
    out=my_sample
    method=srs    /*use simple random sampling*/
    n=5           /*select a total of 5 observations*/
    seed=1;       /*set seed to make this example reproducible*/
run;

/*view sample*/
proc print data=my_sample;
run;

The resulting sample contains 5 observations randomly chosen from the entire dataset.
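If you would rather request a fraction of the dataset than a fixed count, the SAMPRATE= option can be used in place of N=. The following is a minimal sketch under stated assumptions: the 30% rate and the output dataset name my_sample_rate are made up for illustration and are not part of the original example.

```sas
/*sketch: sample by rate instead of by count*/
proc surveyselect data=my_data
    out=my_sample_rate   /*hypothetical output dataset name*/
    method=srs           /*use simple random sampling*/
    samprate=0.3         /*assumed rate: select roughly 30% of observations*/
    seed=1;              /*set seed to make this example reproducible*/
run;

/*view sample*/
proc print data=my_sample_rate;
run;
```

The number of rows returned depends on how SAS converts the rate into a whole number of observations, so check the procedure's summary output rather than assuming an exact count.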

Example 2: Use PROC SURVEYSELECT to Select Stratified Random Sample

We can use the following syntax to perform stratified random sampling in which 2 observations are randomly chosen from each team to be included in the sample:

proc surveyselect data=my_data
    out=my_sample
    method=srs    /*use simple random sampling within strata*/
    n=2           /*select 2 observations from each stratum*/
    seed=1;       /*set seed to make this example reproducible*/
    strata team;  /*use team as the stratification variable*/
run;

/*view sample*/
proc print data=my_sample;
run;

The resulting sample contains 2 observations randomly chosen from each team.
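One thing to keep in mind is that the STRATA statement expects the input dataset to be sorted by the stratification variable. The dataset in this tutorial already lists the teams in order, but if yours does not, a sketch like the following may help. The dataset names my_data_sorted and my_strat_sample are hypothetical and chosen only for this illustration.

```sas
/*sort the data by the stratification variable first*/
proc sort data=my_data out=my_data_sorted;   /*hypothetical sorted copy*/
    by team;
run;

/*then select 2 observations from each team*/
proc surveyselect data=my_data_sorted
    out=my_strat_sample  /*hypothetical output dataset name*/
    method=srs           /*use simple random sampling within strata*/
    n=2                  /*select 2 observations from each stratum*/
    seed=1;              /*set seed to make this example reproducible*/
    strata team;         /*use team as the stratification variable*/
run;

/*view sample*/
proc print data=my_strat_sample;
run;
```

Writing the sorted data to a separate dataset leaves the original my_data untouched, which is convenient if you want to draw other samples from it later.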

Related: Cluster Sampling vs. Stratified Sampling: What’s the Difference?

Example 3: Use PROC SURVEYSELECT to Select Clustered Random Sample

We can use the following syntax to perform clustered random sampling in which the teams serve as clusters. SAS randomly selects 2 clusters and includes every observation from those clusters in the sample:

proc surveyselect data=my_data
    out=my_sample
    n=2           /*select a total of 2 clusters*/
    seed=1;       /*set seed to make this example reproducible*/
    cluster team; /*use team as the clustering variable*/
run;

/*view sample*/
proc print data=my_sample;
run;

This particular sample contains every observation from teams A and B, which were the two “clusters” randomly chosen.
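To confirm which clusters were selected without scanning the printed rows, one option is to count the sampled observations per team with PROC FREQ. This is only a sketch: the output dataset name my_cluster_sample is hypothetical, and the teams that appear depend on the seed and your SAS version.

```sas
/*draw the cluster sample into a separate dataset*/
proc surveyselect data=my_data
    out=my_cluster_sample  /*hypothetical output dataset name*/
    n=2                    /*select a total of 2 clusters*/
    seed=1;                /*set seed to make this example reproducible*/
    cluster team;          /*use team as the clustering variable*/
run;

/*count sampled observations per team*/
proc freq data=my_cluster_sample;
    tables team / nocum nopercent;  /*show plain frequencies only*/
run;
```

Each team listed in the frequency table should show the same count as it has in my_data, since cluster sampling keeps every observation from a selected cluster.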

Note: You can find the complete documentation for PROC SURVEYSELECT in the official SAS documentation.

Additional Resources

The following tutorials explain how to perform other common tasks in SAS:

How to Calculate Descriptive Statistics in SAS
How to Create Frequency Tables in SAS
How to Calculate Percentiles in SAS
How to Create Pivot Tables in SAS
