How to Remove Special Characters from Strings in SAS


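The easiest way to remove special characters from a string in SAS is to use the COMPRESS function with the ‘kas’ modifiers, which keep only alphabetic and space characters. Note that everything else is removed, including digits.

This function uses the following basic syntax, where the second argument is left empty so that the modifiers alone determine which characters are kept: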

data new_data;
    set original_data;
    remove_specials = compress(some_string, , 'kas');
run;
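
As a quick check of this syntax before applying it to a dataset, the following minimal sketch runs COMPRESS on a single literal value in a DATA _NULL_ step and writes the result to the log. The string and the variable names are made up for illustration.

```sas
/*quick check on one hypothetical value; no dataset is created*/
data _null_;
    some_string = 'Hello, World!';
    /*second argument is empty, so the modifiers decide what to keep*/
    remove_specials = compress(some_string, , 'kas');
    put remove_specials=;
run;
```

Here the comma and the exclamation mark are dropped, while the letters and the blank between the two words are kept.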

The following example shows how to use this syntax in practice.

Example: Remove Special Characters from String in SAS

Suppose we have the following dataset in SAS that contains the names of various employees and their total sales:

/*create dataset*/
data data1;
    input name $ sales;
    datalines;
Bob&%^ 45
M&$#@ike 50
Randy)) 39
Chad!? 14
Dan** 29
R[on] 44
;
run;

/*view dataset*/
proc print data=data1;
run;

Notice that the values in the name column contain several special characters.

We can use the COMPRESS function to remove these special characters:

/*create second dataset with special characters removed from names*/
data data2;
  set data1;
  new_name=compress(name, , 'kas');
run;

/*view dataset*/
proc print data=data2;
run;

Notice that the new_name column contains the values in the name column with the special characters removed. For example, M&$#@ike becomes Mike and R[on] becomes Ron.

Here’s what each modifier in ‘kas’ told the COMPRESS function to do:

  • k specifies that the listed characters should be kept instead of removed
  • a adds alphabetic characters to the list of characters to keep
  • s adds space characters (blanks, tabs, and similar whitespace) to the list of characters to keep
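
Because ‘kas’ keeps only letters and space characters, it also strips digits. If digits should survive, one option is to add the d modifier, which adds digits to the list of characters. The sketch below compares the two on a made-up value; the string and the variable names are assumptions for illustration only.

```sas
/*compare 'kas' and 'kads' on one hypothetical value*/
data _null_;
    raw = 'Room #42-B';
    letters_only = compress(raw, , 'kas');   /*keeps letters and spaces: Room B*/
    with_digits  = compress(raw, , 'kads');  /*also keeps digits: Room 42B*/
    put letters_only= with_digits=;
run;
```

The same pattern applies to any other combination of modifiers: k switches the function from removing to keeping, and each additional letter widens the set of characters that are kept.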

Note: You can find a complete list of modifiers for the COMPRESS function in the SAS documentation for that function.

Additional Resources

The following tutorials explain how to perform other common tasks in SAS:

How to Extract Numbers from String in SAS
How to Use the SUBSTR Function in SAS
How to Convert Strings to Uppercase, Lowercase & Proper Case in SAS
