You can use the **cut_number()** function from the **ggplot2** package in R to split a numeric vector into groups that each contain approximately the same number of observations.

This function uses the following basic syntax:

**cut_number(x, n)**

where:

- **x:** Name of numeric vector to split
- **n:** Number of groups

The following example shows how to use this function in practice.

**Example: How to Split Data into Equal Sized Groups in R**

Suppose we have the following data frame in R that contains information about the points scored by 12 different basketball players:

    #create data frame
    df <- data.frame(player=LETTERS[1:12],
                     points=c(1, 2, 2, 2, 4, 5, 7, 9, 12, 14, 15, 22))

    #view data frame
    df

       player points
    1       A      1
    2       B      2
    3       C      2
    4       D      2
    5       E      4
    6       F      5
    7       G      7
    8       H      9
    9       I     12
    10      J     14
    11      K     15
    12      L     22

We can use the **cut_number()** function from the **ggplot2** package to create a new column called **group** that assigns each row in the data frame to one of three groups based on the value in the **points** column:

    library(ggplot2)

    #create new column that splits data into three equal sized groups based on points
    df$group <- cut_number(df$points, 3)

    #view updated data frame
    df

       player points     group
    1       A      1  [1,3.33]
    2       B      2  [1,3.33]
    3       C      2  [1,3.33]
    4       D      2  [1,3.33]
    5       E      4 (3.33,10]
    6       F      5 (3.33,10]
    7       G      7 (3.33,10]
    8       H      9 (3.33,10]
    9       I     12   (10,22]
    10      J     14   (10,22]
    11      K     15   (10,22]
    12      L     22   (10,22]

Each of the 12 players has been placed into one of three groups based on the value in the **points** column.

From the output we can see that there are 3 distinct groups:

- group 1: points value is between 1 and 3.33, with both endpoints included.
- group 2: points value is greater than 3.33 and at most 10.
- group 3: points value is greater than 10 and at most 22.

We can see that four players have been placed into each group.
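The same grouping can be reproduced in base R, which may help show where the interval endpoints come from. The sketch below assumes that **cut_number()** places its breaks at evenly spaced quantiles of the data and then bins the values with **cut()**. That is an assumption about the function's internals rather than something stated in this tutorial, and the variable names are illustrative.

```r
# Points scored by the 12 players in the example
points <- c(1, 2, 2, 2, 4, 5, 7, 9, 12, 14, 15, 22)

# Assumed approach: put breaks at the 0%, 33.3%, 66.7% and 100% quantiles
breaks <- quantile(points, probs = seq(0, 1, length.out = 4), names = FALSE)

# Bin the values; include.lowest = TRUE closes the first interval on the left
group <- cut(points, breaks = breaks, include.lowest = TRUE)

# Count how many players fall into each group
table(group)
```

For this data the table should report four players in each of the three intervals, matching the **group** column above.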

If you would like the **group** column to display the groups as integer values instead, you can wrap the **cut_number()** function in the **as.numeric()** function. Because **cut_number()** returns a factor, this converts each interval to its integer level code:

    library(ggplot2)

    #create new column that splits data into three equal sized groups based on points
    df$group <- as.numeric(cut_number(df$points, 3))

    #view updated data frame
    df

       player points group
    1       A      1     1
    2       B      2     1
    3       C      2     1
    4       D      2     1
    5       E      4     2
    6       F      5     2
    7       G      7     2
    8       H      9     2
    9       I     12     3
    10      J     14     3
    11      K     15     3
    12      L     22     3

The new group column now contains the values 1, 2 and 3 to indicate which group the player belongs to.

Once again, each group contains four players.
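As an alternative to **as.numeric()**, the labels can be set when the groups are created. The sketch below relies on **cut_number()** forwarding extra arguments to base R's **cut()**, as described in the ggplot2 documentation. The label names "low", "mid" and "high" are illustrative and not part of the original example.

```r
library(ggplot2)

# Points scored by the 12 players in the example
points <- c(1, 2, 2, 2, 4, 5, 7, 9, 12, 14, 15, 22)

# labels = FALSE is passed on to cut() and returns integer codes directly
group_codes <- cut_number(points, 3, labels = FALSE)

# A character vector of labels names the groups instead (illustrative names)
group_names <- cut_number(points, 3, labels = c("low", "mid", "high"))

# Count how many players fall into each named group
table(group_names)
```

Named labels can be easier to read in plots and summaries than interval notation or bare integers.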

**Note**: To split the points column into more than three groups, simply change the **3** in the **cut_number()** function to a different number.
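When the number of groups changes, tied values can keep the groups from being exactly equal in size. A quick check, assuming the same **points** values as above and using **table()** to count group membership:

```r
library(ggplot2)

# Points scored by the 12 players in the example
points <- c(1, 2, 2, 2, 4, 5, 7, 9, 12, 14, 15, 22)

# Split into four groups instead of three
group4 <- cut_number(points, 4)

# Count players per group; the three tied values of 2 all land in the first interval
table(group4)
```

With this data the four groups contain four, two, three and three players, because the three players with 2 points cannot be divided between groups.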

**Additional Resources**

The following tutorials explain how to perform other common tasks in R:

- How to Split a Data Frame in R
- How to Split Data into Training & Test Sets in R
- How to Perform Data Binning in R