# dplyr: How to Filter Based on Factor

You can use the following methods in dplyr to filter the rows of a data frame in R based on a factor variable:

Method 1: Filter Based on Factor Labels

```
library(dplyr)

#filter rows where team column is equal to factor label 'A' or 'C'
df %>%
  filter(team %in% c('A', 'C'))
```

Method 2: Filter Based on Factor Levels

```
library(dplyr)

#filter rows where factor level of team column is greater than 2
df %>%
  filter(as.integer(team) > 2)
```

The following examples show how to use each method in practice with a data frame in R that contains information about various basketball players:

```
#create data frame
df <- data.frame(team=as.factor(c('A', 'A', 'A', 'B', 'B', 'C', 'C', 'D')),
                 points=c(12, 34, 20, 25, 22, 28, 34, 19))

#view data frame
df

  team points
1    A     12
2    A     34
3    A     20
4    B     25
5    B     22
6    C     28
7    C     34
8    D     19
```
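Before filtering, it can help to confirm that the team column really is stored as a factor and to see which levels it contains. The following is a minimal sketch that rebuilds the same data frame and uses only base R functions:

```r
#recreate the data frame from above
df <- data.frame(team=as.factor(c('A', 'A', 'A', 'B', 'B', 'C', 'C', 'D')),
                 points=c(12, 34, 20, 25, 22, 28, 34, 19))

#check that team is stored as a factor
is.factor(df$team)

#list the factor levels in their stored order
levels(df$team)

#count how many rows belong to each level
table(df$team)
```

For this data frame, levels(df$team) returns "A" "B" "C" "D", which is the order that the filters below rely on.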

## Example 1: Filter Based on Factor Labels

We can use the following syntax to filter the data frame to only contain rows where the factor labels of the team column are equal to A or C:

```
library(dplyr)

#filter rows where team column is equal to factor label 'A' or 'C'
df %>%
  filter(team %in% c('A', 'C'))

  team points
1    A     12
2    A     34
3    A     20
4    C     28
5    C     34
```

Notice that the resulting data frame only contains rows where the value in the team column is equal to either A or C.
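The same pattern can likely be adapted to match a single label or to exclude labels. The sketch below is only an illustration and not one of the two methods above: it uses == for one label and ! together with %in% to drop labels, and the names only_a and not_a_or_c are hypothetical.

```r
library(dplyr)

#recreate the data frame from above
df <- data.frame(team=as.factor(c('A', 'A', 'A', 'B', 'B', 'C', 'C', 'D')),
                 points=c(12, 34, 20, 25, 22, 28, 34, 19))

#keep rows where team is equal to the single label 'A'
only_a <- df %>%
  filter(team == 'A')

#keep rows where team is NOT equal to 'A' or 'C'
not_a_or_c <- df %>%
  filter(!team %in% c('A', 'C'))

#view both results
only_a
not_a_or_c
```

Here only_a should contain the three rows for team A, and not_a_or_c should contain the rows for teams B and D.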

## Example 2: Filter Based on Factor Levels

We can use the following syntax to filter the data frame to only contain rows where the factor levels of the team column are greater than 2:

```
library(dplyr)

#filter rows where factor level of team column is greater than 2
df %>%
  filter(as.integer(team) > 2)

  team points
1    C     28
2    C     34
3    D     19
```

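In this particular example, the as.integer() function converts each value of the team column to its underlying integer code, which is the position of that value's level in the factor's list of levels. Because as.factor() sorts the levels of a character vector alphabetically, the codes follow alphabetical order here.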

For example:

• Factor level ‘A’ becomes 1.
• Factor level ‘B’ becomes 2.
• Factor level ‘C’ becomes 3.
• Factor level ‘D’ becomes 4.

Thus, when we filter for rows where the factor level is greater than 2, only the rows with a value of C or D in the team column are kept.
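Since the integer codes follow the order of the levels, it can be worth printing the mapping before relying on a numeric cutoff. The following is a minimal sketch, assuming the same data frame as above; level_map is a hypothetical name used only for illustration.

```r
#recreate the data frame from above
df <- data.frame(team=as.factor(c('A', 'A', 'A', 'B', 'B', 'C', 'C', 'D')),
                 points=c(12, 34, 20, 25, 22, 28, 34, 19))

#pair each level with the integer code that as.integer() returns for it
level_map <- data.frame(level=levels(df$team),
                        code=seq_along(levels(df$team)))
level_map

#view the integer code for each row of the team column
as.integer(df$team)
```

If the levels were defined in a different order, for example with the levels argument of factor(), the codes would change and the same numeric filter would keep different rows.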