R: How to Split Number into Digits


Often you may want to split numbers into individual digits in a data frame in R.

One way to do so is to combine the strsplit() function with the gsub() function using the following syntax:

cbind(do.call(rbind, strsplit(gsub('(.)(.)(.)(.)',
                                   '\\1 \\2 \\3 \\4',
                                   paste(df[,1])),' ')),
                                   df[, -1])

This particular example splits each number in the first column of the data frame named df into four new columns, each containing one digit of the original number. The trailing df[, -1] appends any remaining columns of df to the result.

Here is what the various functions do:

  • cbind: Binds columns together
  • do.call: Calls a function with a list of arguments (here, rbind on the list returned by strsplit)
  • rbind: Binds rows together
  • strsplit: Splits a string
  • gsub: Substitutes one pattern for another pattern
  • paste: Converts its arguments to character strings and concatenates them (here, it turns the numbers into strings)

Used together, these functions let us split a number into individual digits, or groups of digits, according to whatever pattern we specify.
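
To make the role of each function more concrete, the nested call can be unpacked one step at a time. The following is only a sketch: the vector ids and the intermediate names (id_strings, spaced, pieces, digit_matrix) are hypothetical and used purely for illustration.

```r
# hypothetical input: two four-digit numbers
ids <- c(1004, 2945)

# paste() converts the numbers to character strings
id_strings <- paste(ids)

# gsub() captures four single characters and rewrites them separated by spaces
spaced <- gsub('(.)(.)(.)(.)', '\\1 \\2 \\3 \\4', id_strings)

# strsplit() splits each string at the spaces and returns a list of character vectors
pieces <- strsplit(spaced, ' ')

# do.call(rbind, ...) stacks the list elements as the rows of a character matrix
digit_matrix <- do.call(rbind, pieces)
digit_matrix
```

Each intermediate object can be printed to see what the corresponding function contributes to the final result.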

The following example shows how to use this syntax in practice.

Example: How to Split Number into Digits in R

Suppose we have the following data frame that contains the ID numbers for various employees at some company:

#create data frame
df <- data.frame(ID=c(1004, 2945, 3482, 7750, 9284, 1027, 3399))

#view data frame
df

    ID
1 1004
2 2945
3 3482
4 7750
5 9284
6 1027
7 3399

The data frame contains seven total employee ID numbers.

We can see that each ID number contains exactly four digits. Suppose that we would like to split each of these numbers into four new columns in which each column contains one digit from the original number.

We can use the following syntax to do so:

#split numbers into four columns
cbind(do.call(rbind, strsplit(gsub('(.)(.)(.)(.)',
                                   '\\1 \\2 \\3 \\4',
                                   paste(df[,1])),' ')),
                                   df[, -1])

  1 2 3 4
1 1 0 0 4
2 2 9 4 5
3 3 4 8 2
4 7 7 5 0
5 9 2 8 4
6 1 0 2 7
7 3 3 9 9

This returns a data frame with four columns in which each column contains one digit from the original employee ID number.

For example, the first ID number of 1004 has been split into individual digits across four columns.
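
Because strsplit() returns character strings, the digits produced this way are text rather than numbers. If numeric digits are needed, one possible approach is to convert the character matrix before building the data frame. This is a minimal sketch: the names digits, digits_num and digits_df, and the column names digit1 through digit4, are assumptions chosen for illustration.

```r
# same data frame as in the example above
df <- data.frame(ID=c(1004, 2945, 3482, 7750, 9284, 1027, 3399))

# character matrix with one digit per column
digits <- do.call(rbind, strsplit(gsub('(.)(.)(.)(.)',
                                       '\\1 \\2 \\3 \\4',
                                       paste(df[,1])), ' '))

# convert the text digits to numbers, keeping the same number of rows
digits_num <- matrix(as.numeric(digits), nrow = nrow(digits))

# build a data frame with illustrative column names
digits_df <- as.data.frame(digits_num)
names(digits_df) <- paste0('digit', 1:4)
digits_df
```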

Note that each (.) within the gsub() pattern is a capture group that matches exactly one character, so the number of dots inside each pair of parentheses determines how many digits are placed in each column of the resulting data frame.

If we’d like, we could place multiple digits into a new column.

For example, we could use the following syntax to place the first two digits of each employee ID number into one column and then place the last two digits into a second column:

#split numbers into two columns
cbind(do.call(rbind, strsplit(gsub('(..)(..)',
                                   '\\1 \\2',
                                   paste(df[,1])),' ')),
                                   df[, -1])

   1  2
1 10 04
2 29 45
3 34 82
4 77 50
5 92 84
6 10 27
7 33 99

This returns a data frame with two columns in which each column contains two digits from the original employee ID number.

For example, the first employee ID number of 1004 has been split into two columns in which the first two digits belong to the first column and the next two digits belong to the second column.

Note that you can use similar notation to split a number into however many columns you would like.
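
The pattern-based approach assumes that every number has the same number of digits. When the numbers vary in length, a different technique may be more convenient: splitting each string on the empty string with strsplit(), which returns a list with one vector of digits per number instead of a data frame. The sketch below uses a hypothetical vector ids and a hypothetical result name digit_list.

```r
# hypothetical numbers with different lengths
ids <- c(7, 42, 1004)

# split each number into its characters, then convert them to integers
digit_list <- lapply(strsplit(as.character(ids), ''), as.integer)
digit_list
```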

Also note that each column is not required to have the same number of digits in it.
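
As an illustration of unequal group sizes, the following sketch places the first digit of each employee ID number in one column and the remaining three digits in a second column. The result name uneven is hypothetical.

```r
# same data frame as in the example above
df <- data.frame(ID=c(1004, 2945, 3482, 7750, 9284, 1027, 3399))

# one character in the first group, three characters in the second group
uneven <- do.call(rbind, strsplit(gsub('(.)(...)',
                                       '\\1 \\2',
                                       paste(df[,1])), ' '))
uneven
```

Here the first row of the result holds "1" and "004", since the values remain character strings and leading zeros are preserved.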

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Filter a Vector in R
How to Split a Vector into Chunks in R
How to Remove NA Values from Vector in R
How to Remove Specific Elements from Vector in R
