You can use the following syntax to sort the rows in a pivot table in PySpark based on values in a specific column:

```python
df_pivot.orderBy('my_column').show()
```

This particular example sorts the rows in the pivot table called **df_pivot** based on the values in the column named **my_column**.

The following example shows how to use this syntax in practice.

**Example: How to Sort a Pivot Table in PySpark**

Suppose we have the following PySpark DataFrame that contains information about the points scored by various basketball players:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

#define data
data = [['A', 'Guard', 14],
        ['A', 'Guard', 4],
        ['A', 'Forward', 16],
        ['A', 'Forward', 18],
        ['B', 'Guard', 9],
        ['B', 'Forward', 5],
        ['B', 'Forward', 25],
        ['C', 'Forward', 12],
        ['C', 'Guard', 14],
        ['C', 'Guard', 23]]

#define column names
columns = ['team', 'position', 'points']

#create dataframe using data and column names
df = spark.createDataFrame(data, columns)

#view dataframe
df.show()

+----+--------+------+
|team|position|points|
+----+--------+------+
|   A|   Guard|    14|
|   A|   Guard|     4|
|   A| Forward|    16|
|   A| Forward|    18|
|   B|   Guard|     9|
|   B| Forward|     5|
|   B| Forward|    25|
|   C| Forward|    12|
|   C|   Guard|    14|
|   C|   Guard|    23|
+----+--------+------+
```

We can use the following syntax to create a pivot table using **team** as the rows, **position** as the columns, and the sum of **points** as the values in the pivot table:

```python
#create pivot table that shows sum of points by team and position
df_pivot = df.groupBy('team').pivot('position').sum('points')

#view pivot table
df_pivot.show()

+----+-------+-----+
|team|Forward|Guard|
+----+-------+-----+
|   B|     30|    9|
|   C|     12|   37|
|   A|     34|   18|
+----+-------+-----+
```

The resulting pivot table shows the sum of the points values for each team and position.

We can use the following syntax to sort the rows of the pivot table in ascending order based on the values in the **Forward** column:

```python
#sort rows of pivot table by values in 'Forward' column in ascending order
df_pivot.orderBy('Forward').show()

+----+-------+-----+
|team|Forward|Guard|
+----+-------+-----+
|   C|     12|   37|
|   B|     30|    9|
|   A|     34|   18|
+----+-------+-----+
```

Notice that the rows in the pivot table are now sorted in ascending order based on the values in the **Forward** column.

If you would instead like to sort the rows in descending order, you can use the argument **ascending=False** as follows:

```python
#sort rows of pivot table by values in 'Forward' column in descending order
df_pivot.orderBy('Forward', ascending=False).show()

+----+-------+-----+
|team|Forward|Guard|
+----+-------+-----+
|   A|     34|   18|
|   B|     30|    9|
|   C|     12|   37|
+----+-------+-----+
```

The rows in the pivot table are now sorted in descending order based on the values in the **Forward** column.

**Note**: You can find the complete documentation for the PySpark **orderBy** function in the official PySpark documentation.

**Additional Resources**

The following tutorials explain how to perform other common tasks in PySpark:

How to Create a Pivot Table in PySpark

How to Unpivot a PySpark DataFrame

How to Order by Multiple Columns in PySpark