You can use the following methods to check if a column of a PySpark DataFrame contains a string:

**Method 1: Check if Exact String Exists in Column**

#check if 'conference' column contains exact string 'Eas' in any row
df.where(df.conference=='Eas').count()>0

**Method 2: Check if Partial String Exists in Column**

#check if 'conference' column contains partial string 'Eas' in any row
df.filter(df.conference.contains('Eas')).count()>0

**Method 3: Count Occurrences of Partial String in Column**

#count occurrences of partial string 'Eas' in 'conference' column
df.filter(df.conference.contains('Eas')).count()

The examples below show how to use each method in practice with the following PySpark DataFrame:

from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()

#define data
data = [['A', 'East', 11],
        ['A', 'East', 8],
        ['A', 'East', 10],
        ['B', 'West', 6],
        ['B', 'West', 6],
        ['C', 'East', 5]]

#define column names
columns = ['team', 'conference', 'points']

#create dataframe using data and column names
df = spark.createDataFrame(data, columns)

#view dataframe
df.show()

+----+----------+------+
|team|conference|points|
+----+----------+------+
|   A|      East|    11|
|   A|      East|     8|
|   A|      East|    10|
|   B|      West|     6|
|   B|      West|     6|
|   C|      East|     5|
+----+----------+------+

**Example 1: Check if Exact String Exists in Column**

The following code shows how to check if the exact string ‘Eas’ exists in the **conference** column of the DataFrame:

#check if 'conference' column contains exact string 'Eas' in any row
df.where(df.conference=='Eas').count()>0

False

The output is **False**, which tells us that no value in the **conference** column of the DataFrame is exactly equal to ‘Eas’. Every value is either ‘East’ or ‘West’, so the exact match fails even though ‘Eas’ is part of ‘East’.

**Example 2: Check if Partial String Exists in Column**

The following code shows how to check if the partial string ‘Eas’ exists in the **conference** column of the DataFrame:

#check if 'conference' column contains partial string 'Eas' in any row
df.filter(df.conference.contains('Eas')).count()>0

True

The output is **True**, which tells us that the partial string ‘Eas’ does exist in at least one value of the **conference** column of the DataFrame.

**Example 3: Count Occurrences of Partial String in Column**

The following code shows how to count the number of rows in which the partial string ‘Eas’ occurs in the **conference** column of the DataFrame:

#count occurrences of partial string 'Eas' in 'conference' column
df.filter(df.conference.contains('Eas')).count()

4

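The output is **4**, which tells us that the partial string ‘Eas’ occurs in 4 rows of the **conference** column of the DataFrame. Note that this counts matching rows, so a value containing the string more than once would still be counted only once.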

**Additional Resources**

The following tutorials explain how to perform other common tasks in PySpark:

PySpark: How to Select Rows Based on Column Values

PySpark: How to Select Columns by Index in DataFrame

PySpark: How to Select Rows by Index in DataFrame