PySpark: How to Show Full Column Content


You can use the following methods to force a PySpark DataFrame to show the full content of each column, regardless of width:

Method 1: Use truncate=False

df.show(truncate=False) 

Method 2: Use truncate=0

df.show(truncate=0)

The examples below show how to use each method in practice with this PySpark DataFrame:

from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()

#define data
data = [['A', 'Andy Bob Chad Doug Eric', 136], 
        ['B', 'Frank Henry', 223], 
        ['C', 'Ian John Ken Liam Mike Noah', 450], 
        ['D', 'Oscar Prim', 290], 
        ['E', 'Quentin Ross Sarah', 189]]
  
#define column names
columns = ['store', 'employees', 'sales'] 
  
#create dataframe using data and column names
df = spark.createDataFrame(data, columns) 
  
#view dataframe
df.show()

+-----+--------------------+-----+
|store|           employees|sales|
+-----+--------------------+-----+
|    A|Andy Bob Chad Dou...|  136|
|    B|         Frank Henry|  223|
|    C|Ian John Ken Liam...|  450|
|    D|          Oscar Prim|  290|
|    E|  Quentin Ross Sarah|  189|
+-----+--------------------+-----+

Notice that some of the values in the employees column are cut off. By default, show() truncates any string longer than 20 characters and ends the shortened value with three dots.
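To make the truncation rule concrete, here is a minimal plain-Python sketch of how a single cell appears to be shortened. The helper name truncate_cell is hypothetical and is not part of PySpark. The rule it encodes (keep width minus 3 characters, then append three dots) is an assumption inferred from the output above, not a copy of Spark's implementation.

```python
def truncate_cell(value: str, width: int = 20) -> str:
    """Hypothetical helper that mimics how show() shortens one cell."""
    # A width of 0 (or less) means "no limit"; short values pass through as-is.
    if width <= 0 or len(value) <= width:
        return value
    # Very small widths leave no room for the dots, so just cut the string.
    if width < 4:
        return value[:width]
    # Otherwise keep width - 3 characters and mark the cut with three dots.
    return value[: width - 3] + "..."


for name in ["Andy Bob Chad Doug Eric", "Frank Henry"]:
    print(truncate_cell(name))
```

With the default width of 20, the first name is cut to Andy Bob Chad Dou... while Frank Henry is printed unchanged, which matches the table above.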

Example 1: Show Full Column Content Using truncate=False

We can use the truncate=False argument to show the full content of each column in the PySpark DataFrame:

#view dataframe with full column content
df.show(truncate=False)

+-----+---------------------------+-----+
|store|employees                  |sales|
+-----+---------------------------+-----+
|A    |Andy Bob Chad Doug Eric    |136  |
|B    |Frank Henry                |223  |
|C    |Ian John Ken Liam Mike Noah|450  |
|D    |Oscar Prim                 |290  |
|E    |Quentin Ross Sarah         |189  |
+-----+---------------------------+-----+

Notice that we can now see the full content of the employees column.

Example 2: Show Full Column Content Using truncate=0

We can also use the truncate=0 argument to show the full content of each column in the PySpark DataFrame:

#view dataframe with full column content
df.show(truncate=0)

+-----+---------------------------+-----+
|store|employees                  |sales|
+-----+---------------------------+-----+
|A    |Andy Bob Chad Doug Eric    |136  |
|B    |Frank Henry                |223  |
|C    |Ian John Ken Liam Mike Noah|450  |
|D    |Oscar Prim                 |290  |
|E    |Quentin Ross Sarah         |189  |
+-----+---------------------------+-----+

Once again, we can now see the full content of the employees column.
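The two methods produce identical output because False and 0 resolve to the same character limit. The plain-Python sketch below illustrates that idea. The helper name resolve_truncate is hypothetical, and the mapping it shows (True means a limit of 20, anything else is converted to an integer, and 0 means no limit) is an assumption based on the documented behavior of the truncate argument, not PySpark's actual source code.

```python
def resolve_truncate(truncate=True) -> int:
    """Hypothetical helper: map show()'s truncate argument to a character limit."""
    # True selects the default limit of 20 characters.
    if isinstance(truncate, bool) and truncate:
        return 20
    # False converts to 0, and a limit of 0 means "do not truncate".
    return int(truncate)


print(resolve_truncate(False), resolve_truncate(0))  # prints "0 0"
print(resolve_truncate(True), resolve_truncate(10))  # prints "20 10"
```

Under the same assumption, passing a positive integer such as truncate=10 to show() would shorten values to that many characters instead of disabling truncation.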

Note: You can find the complete documentation for the PySpark show function in the official PySpark API reference for DataFrame.show.

Additional Resources

The following tutorials explain how to perform other common tasks in PySpark:

PySpark: How to Find Unique Values in a Column
PySpark: How to Print One Column of DataFrame
PySpark: How to Get Last Row from DataFrame
