How to Use the SUBSTR Function in SAS (With Examples)


You can use the SUBSTR function in SAS to extract a portion of a string.

This function uses the following basic syntax:

SUBSTR(Source, Position, N)

where:

  • Source: The string to analyze
  • Position: The starting position to read
  • N: The number of characters to read (optional; if omitted, SAS reads from Position to the end of the string)
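
As a quick illustration of the syntax above, the following minimal sketch calls SUBSTR without the third argument. The variable names word and from_third are made up for this example and are not part of the dataset used later.

```sas
/* Sketch: with no N, SUBSTR returns everything from Position onward */
data _null_;
    word = 'Warriors';              /* hypothetical test value */
    from_third = substr(word, 3);   /* start at position 3, read to the end */
    put from_third=;                /* writes the value to the SAS log */
run;
```

Running this should write from_third=rriors to the log, since positions are counted starting from 1.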

Here are the four most common ways to use this function:

Method 1: Extract First N Characters from String

data new_data;
    set original_data;
    first_four = substr(string_variable, 1, 4);
run;

Method 2: Extract Characters in Specific Position Range from String

data new_data;
    set original_data;
    two_through_five = substr(string_variable, 2, 4);
run;

Method 3: Extract Last N Characters from String

data new_data;
    set original_data;
    last_three = substr(string_variable, length(string_variable)-2, 3);
run;

Method 4: Create New Variable if Characters Exist in String

data new_data;
    set original_data;
    if substr(string_variable, 1, 4) = 'some' then new_var = 'Yes';
    else new_var = 'No';
run;

The examples below apply each method to the following dataset in SAS:

/*create dataset*/
data original_data;
    input team $1-10;
    datalines;
Warriors
Wizards
Rockets
Celtics
Thunder
;
run;

/*view dataset*/
proc print data=original_data;
run;

Example 1: Extract First N Characters from String

The following code shows how to extract the first 4 characters from the team variable:

/*create new dataset*/
data new_data;
    set original_data;
    first_four = substr(team, 1, 4);
run;

/*view new dataset*/
proc print data=new_data;
run;

Notice that the first_four variable contains the first four characters of the team variable. For example, 'Warriors' becomes 'Warr'.

Example 2: Extract Characters in Specific Position Range from String

The following code shows how to extract the characters in positions 2 through 5 from the team variable:

/*create new dataset*/
data new_data;
    set original_data;
    two_through_five = substr(team, 2, 4);
run;

/*view new dataset*/
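proc print data=new_data;
run;

Here N is 4 because positions 2 through 5 span four characters. For example, 'Warriors' becomes 'arri'.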
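
When the range is given as a start and end position, N can be computed as end minus start plus 1. The sketch below shows one way to write that. The names new_data_range, start_pos, and end_pos are hypothetical and used only for illustration.

```sas
/* Sketch: derive N from a start and end position */
data new_data_range;                /* hypothetical dataset name */
    set original_data;
    start_pos = 2;                  /* first position to read */
    end_pos = 5;                    /* last position to read */
    /* N = end - start + 1, which is 4 for positions 2 through 5 */
    two_through_five = substr(team, start_pos, end_pos - start_pos + 1);
    drop start_pos end_pos;         /* keep helper variables out of the output */
run;
```

This should produce the same two_through_five values as the code above, while making the range explicit.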

Example 3: Extract Last N Characters from String

The following code shows how to extract the last 3 characters from the team variable:

/*create new dataset*/
data new_data;
    set original_data;
    last_three = substr(team, length(team)-2, 3);
run;

/*view new dataset*/
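proc print data=new_data;
run;

The starting position is length(team)-2 because the last three characters begin two positions before the final character. For example, 'Warriors' becomes 'ors'.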
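
One caveat: if a value has fewer than 3 characters, length(team)-2 falls below 1, which is not a valid starting position for SUBSTR. The following is a minimal sketch of one possible guard, assuming you want to keep whatever characters are available. The dataset name new_data_safe is hypothetical.

```sas
/* Sketch: take up to the last 3 characters without an invalid start position */
data new_data_safe;                 /* hypothetical dataset name */
    set original_data;
    length last_three $3;           /* result holds at most 3 characters */
    /* max() keeps the start position at 1 or higher for short values */
    last_three = substr(team, max(1, length(team) - 2));
run;
```

Because N is omitted, SUBSTR reads to the end of the variable, and the declared length of 3 truncates the result to the characters needed.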

Example 4: Create New Variable if Characters Exist in String

The following code shows how to create a new variable called W_Team that takes a value of 'Yes' if the first character in the team name is 'W' and a value of 'No' otherwise:

/*create new dataset*/
data new_data;
    set original_data;
    if substr(team, 1, 1) = 'W' then W_Team = 'Yes';
    else W_Team = 'No';
run;

/*view new dataset*/
proc print data=new_data;
run;

W_Team is 'Yes' for Warriors and Wizards and 'No' for the other three teams.
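
Note that SAS sets the length of a new character variable from the first value assigned to it. The code above works because 'Yes' (3 characters) is assigned first. If the branches were reversed, 'Yes' would be truncated to 'Ye'. The sketch below shows one way to avoid depending on that order, using a LENGTH statement. The dataset name new_data_len is hypothetical.

```sas
/* Sketch: declare the length so assignment order does not matter */
data new_data_len;                  /* hypothetical dataset name */
    set original_data;
    length W_Team $3;               /* room for both 'Yes' and 'No' */
    if substr(team, 1, 1) ne 'W' then W_Team = 'No';
    else W_Team = 'Yes';
run;
```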

Additional Resources

The following tutorials explain how to perform other common tasks in SAS:

How to Normalize Data in SAS
How to Replace Characters in a String in SAS
How to Replace Missing Values with Zero in SAS
How to Remove Duplicates in SAS
