You can use the following syntax with **coalesce()** to replace null values in one column with corresponding values from another column in a PySpark DataFrame:

```python
from pyspark.sql.functions import coalesce

df.withColumn('points', coalesce('points', 'points_estimate')).show()
```

This particular example replaces null values in the **points** column with corresponding values from the **points_estimate** column.

The following example shows how to use this syntax in practice.

**Example: How to Use fillna() with Another Column in PySpark**

Suppose we have the following PySpark DataFrame that contains information about points scored by various basketball players:

```python
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()

#define data
data = [['Mavs', 18, 18],
        ['Nets', 33, 33],
        ['Lakers', None, 25],
        ['Kings', 15, 15],
        ['Hawks', None, 29],
        ['Wizards', None, 14],
        ['Magic', 28, 28]]

#define column names
columns = ['team', 'points', 'points_estimate']

#create dataframe using data and column names
df = spark.createDataFrame(data, columns)

#view dataframe
df.show()
```

```
+-------+------+---------------+
|   team|points|points_estimate|
+-------+------+---------------+
|   Mavs|    18|             18|
|   Nets|    33|             33|
| Lakers|  null|             25|
|  Kings|    15|             15|
|  Hawks|  null|             29|
|Wizards|  null|             14|
|  Magic|    28|             28|
+-------+------+---------------+
```

Suppose we would like to fill in all of the null values in the **points** column with corresponding values from the **points_estimate** column.

We can use the following syntax to do so:

```python
from pyspark.sql.functions import coalesce

#replace null values in 'points' column with values from 'points_estimate' column
df.withColumn('points', coalesce('points', 'points_estimate')).show()
```

```
+-------+------+---------------+
|   team|points|points_estimate|
+-------+------+---------------+
|   Mavs|    18|             18|
|   Nets|    33|             33|
| Lakers|    25|             25|
|  Kings|    15|             15|
|  Hawks|    29|             29|
|Wizards|    14|             14|
|  Magic|    28|             28|
+-------+------+---------------+
```

Notice that each of the null values in the **points** column has been replaced with the corresponding value from the **points_estimate** column.

**Note**: You can find the complete documentation for the PySpark **coalesce()** function here.

**Additional Resources**

The following tutorials explain how to perform other common tasks in PySpark:

PySpark: How to Use “OR” Operator

PySpark: How to Use “AND” Operator

PySpark: How to Use “NOT IN” Operator