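The **setdiff()** function in R returns the elements of one set that do not occur in a second set. The order of the arguments matters: **setdiff(x, y)** returns the values in *x* that are not in *y*. This function uses the following syntax: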

**setdiff(x, y)**

where:

**x, y:** Vectors or data frames containing a sequence of items

This tutorial provides several examples of how to use this function in practice.

**Example 1: Setdiff with Numeric Vectors**

The following code shows how to use **setdiff()** to identify all of the values in vector *a* that do not occur in vector *b*:

    #define vectors
    a <- c(1, 3, 4, 5, 9, 10)
    b <- c(1, 2, 3, 4, 5, 6)

    #find all values in a that do not occur in b
    setdiff(a, b)

    [1]  9 10

There are two values that occur in vector *a* that do not occur in vector *b*: **9** and **10**.

If we reverse the order of the vectors in the **setdiff()** function, we can instead identify all of the values in vector *b* that do not occur in vector *a*:

    #find all values in b that do not occur in a
    setdiff(b, a)

    [1] 2 6

There are two values that occur in vector *b* that do not occur in vector *a*: **2** and **6**.
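Because **setdiff()** only looks in one direction, finding every value that occurs in exactly one of the two vectors (the symmetric difference) takes both calls. A minimal sketch, reusing vectors *a* and *b* from above and combining the results with base R's **union()**; the name *sym_diff* is only illustrative:

```r
#define vectors (same values as in the example above)
a <- c(1, 3, 4, 5, 9, 10)
b <- c(1, 2, 3, 4, 5, 6)

#combine both one-directional differences into one vector
sym_diff <- union(setdiff(a, b), setdiff(b, a))

#view the values that occur in exactly one of the two vectors
sym_diff
# returns 9 10 2 6
```

The values from `setdiff(a, b)` come first in the result, followed by the values from `setdiff(b, a)`.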

**Example 2: Setdiff with Character Vectors**

The following code shows how to use **setdiff()** to identify all of the values in vector *char1* that do not occur in vector *char2*:

    #define character vectors
    char1 <- c('A', 'B', 'C', 'D', 'E')
    char2 <- c('A', 'B', 'E', 'F', 'G')

    #find all values in char1 that do not occur in char2
    setdiff(char1, char2)

    [1] "C" "D"
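One detail worth knowing is that **setdiff()** treats its inputs as sets, so repeated values are returned only once. If the repeats should be kept, logical indexing with **%in%** is one possible alternative. The sketch below uses two hypothetical vectors, *x* and *y*, that are not part of the examples above:

```r
#define hypothetical vectors, where x contains repeated values
x <- c('C', 'A', 'C', 'D', 'D')
y <- c('A', 'B')

#setdiff() returns each differing value once
deduped <- setdiff(x, y)
deduped
# returns "C" "D"

#logical indexing with %in% keeps every occurrence
kept <- x[!(x %in% y)]
kept
# returns "C" "C" "D" "D"
```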

**Example 3: Setdiff with Data Frames**

The following code shows how to use **setdiff()** to identify all of the values in one data frame column that do not appear in the same column of a second data frame:

    #define data frames
    df1 <- data.frame(team=c('A', 'B', 'C', 'D'),
                      conference=c('West', 'West', 'East', 'East'),
                      points=c(88, 97, 94, 104))

    df2 <- data.frame(team=c('A', 'B', 'C', 'D'),
                      conference=c('West', 'West', 'East', 'East'),
                      points=c(88, 97, 98, 99))

    #find differences between the points columns in the two data frames
    setdiff(df1$points, df2$points)

    [1]  94 104

We can see that the values **94** and **104** occur in the points column of the first data frame, but not in the points column of the second data frame.
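The call above compares a single column. To find entire rows of the first data frame that have no exact match in the second, one possible base R approach is to paste each row into a single key and compare the keys. This is only a sketch: the names *key1*, *key2*, and *rows_only_in_df1* are illustrative, and the approach assumes that no value contains the separator character used to build the keys.

```r
#define data frames (same values as in the example above)
df1 <- data.frame(team=c('A', 'B', 'C', 'D'),
                  conference=c('West', 'West', 'East', 'East'),
                  points=c(88, 97, 94, 104))

df2 <- data.frame(team=c('A', 'B', 'C', 'D'),
                  conference=c('West', 'West', 'East', 'East'),
                  points=c(88, 97, 98, 99))

#build one key per row by pasting all columns together
#(assumes no value contains the "\r" separator)
key1 <- do.call(paste, c(df1, sep = "\r"))
key2 <- do.call(paste, c(df2, sep = "\r"))

#keep the rows of df1 whose key does not appear in df2
rows_only_in_df1 <- df1[!(key1 %in% key2), ]

#view the rows for teams C and D, which differ in points
rows_only_in_df1
```

The **dplyr** package also provides its own **setdiff()** that compares data frames row by row, which may be more convenient if that package is already in use.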

