Often you may want to return all rows that two data frames in R have in common.

Fortunately this is easy to do by using the **intersect()** function from the **dplyr** package in R, which is designed to perform this exact task.

The **intersect()** function uses the following basic syntax:

**intersect(x, y)**

where:

- **x**: The name of the first data frame
- **y**: The name of the second data frame

Note that this function returns a data frame as a result.

Also note that the counterpart of this function is the **union()** function, which uses the same syntax and returns all distinct rows that occur in *either* data frame.
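To make the difference between the two functions concrete, here is a minimal sketch that uses two small data frames. The names **a** and **b** and their values are made up for illustration only and are not part of the example that follows.

```r
library(dplyr)

# Two small illustrative data frames (hypothetical names and values)
a <- data.frame(team = c('A', 'B'), points = c(14, 38))
b <- data.frame(team = c('B', 'C'), points = c(38, 20))

# Rows that occur in both data frames: only (B, 38)
both <- intersect(a, b)

# Distinct rows that occur in either data frame: (A, 14), (B, 38), (C, 20)
either <- union(a, b)

nrow(both)    # 1
nrow(either)  # 3
```

Both functions compare entire rows, so the two data frames should have the same columns.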

The following example shows how to use the **intersect()** function from the **dplyr** package in practice.

**Note**: Before using the **intersect()** function, you may need to first install the **dplyr** package by using the following syntax:

    install.packages('dplyr')

Once the **dplyr** package is installed, you can use the **intersect()** function.

**Example: How to Use the intersect() Function in dplyr**

Suppose we create the following two data frames named **df1** and **df2**:

    #create first data frame
    df1 <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                      points=c(14, 14, 19, 25, 40, 34, 38, 17))

    df1

      team points
    1    A     14
    2    A     14
    3    A     19
    4    A     25
    5    B     40
    6    B     34
    7    B     38
    8    B     17

    #create second data frame
    df2 <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                      points=c(14, 10, 11, 15, 10, 32, 38, 27))

    df2

      team points
    1    A     14
    2    A     10
    3    A     11
    4    A     15
    5    B     10
    6    B     32
    7    B     38
    8    B     27

Suppose that we would like to return a single data frame that contains all rows that occur in both data frames.

We can use the **intersect()** function from the **dplyr** package to do so:

    library(dplyr)

    #return all rows that occur in both data frames
    df_all <- intersect(df1, df2)

    #view resulting data frame
    df_all

      team points
    1    A     14
    2    B     38

Notice that the new data frame named **df_all** contains each distinct row that occurs in both data frames.

From the output we can see that only two rows occur in both data frames.
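One detail worth noting: the row (A, 14) appears twice in **df1**, yet it is listed only once in the result, because **intersect()** treats the data frames as sets and drops duplicate rows. If you need to keep every matching row from the first data frame, repeats included, one possible alternative (not part of the original example) is the **semi_join()** function from **dplyr**. The sketch below assumes that both columns should be used as the matching keys.

```r
library(dplyr)

# Same data frames as in the example above
df1 <- data.frame(team = c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                  points = c(14, 14, 19, 25, 40, 34, 38, 17))
df2 <- data.frame(team = c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                  points = c(14, 10, 11, 15, 10, 32, 38, 27))

# intersect() returns each shared row once
shared_distinct <- intersect(df1, df2)
nrow(shared_distinct)  # 2

# semi_join() keeps every row of df1 that has a match in df2,
# so the repeated (A, 14) row is retained
shared_all <- semi_join(df1, df2, by = c('team', 'points'))
nrow(shared_all)  # 3
```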

If you would simply like to know the number of rows that occur in both data frames, then you can wrap the **nrow()** function around the **intersect()** function to return the number of resulting rows.

Note that the **nrow()** function is used to return the number of rows in a given data frame.

We can use the following syntax to return the number of rows that occur in both data frames:

    library(dplyr)

    #return number of rows that occur in both data frames
    df_all_num <- nrow(intersect(df1, df2))

    #view results
    df_all_num

    [1] 2

This returns a value of **2**, which tells us that there are two rows that occur in both **df1** and **df2**. This matches the result from the previous example.

Note that if the **nrow()** function returned a value of **0** then it would tell us that the two data frames do not share any rows in common.
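As a quick illustration of that case, here is a minimal sketch that uses two data frames with no rows in common. The names **df_a** and **df_b** and their values are hypothetical and chosen only for this check.

```r
library(dplyr)

# Two illustrative data frames that share no rows (hypothetical values)
df_a <- data.frame(team = c('A', 'B'), points = c(14, 38))
df_b <- data.frame(team = c('C', 'D'), points = c(10, 20))

# intersect() returns a data frame with zero rows here
shared <- nrow(intersect(df_a, df_b))

# Report whether the two data frames share any rows
if (shared == 0) {
  message('The data frames share no rows.')
} else {
  message('Number of shared rows: ', shared)
}
```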

**Note**: You can find the complete documentation for the **intersect()** function in the reference documentation for the **dplyr** package.

