How to Calculate Standard Deviation in Pandas (With Examples)


You can use the DataFrame.std() function to calculate the standard deviation of values in a pandas DataFrame.

You can use the following methods to calculate the standard deviation in practice:

Method 1: Calculate Standard Deviation of One Column

df['column_name'].std() 

Method 2: Calculate Standard Deviation of Multiple Columns

df[['column_name1', 'column_name2']].std() 

Method 3: Calculate Standard Deviation of All Numeric Columns

df.std(numeric_only=True)

Note that the std() function ignores NaN values by default (skipna=True) and calculates the sample standard deviation (ddof=1), which divides by n - 1 rather than n.
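To see the NaN handling in action, here is a minimal sketch using a small hypothetical Series (the values are made up for illustration and are not part of the example DataFrame). It compares the default behavior with skipna=False, which makes a missing value propagate instead of being ignored.

```python
import numpy as np
import pandas as pd

#hypothetical data with one missing value
s = pd.Series([25, 12, np.nan, 14])

#default: the NaN is skipped, so only 25, 12, and 14 are used
sd_default = s.std()

#dropping the NaN first gives the same result
sd_dropped = s.dropna().std()

#skipna=False: a single NaN makes the result NaN
sd_strict = s.std(skipna=False)

print(sd_default)  #7.0
print(sd_dropped)  #7.0
print(sd_strict)   #nan
```

If a missing value should be treated as an error rather than silently skipped, checking for a NaN result with skipna=False is one simple way to detect it.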

The following examples show how to use each method with the following pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
                   'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#view DataFrame
print(df)

  team  points  assists  rebounds
0    A      25        5        11
1    A      12        7         8
2    B      15        7        10
3    B      14        9         6
4    B      19       12         6
5    B      23        9         5
6    C      25        9         9
7    C      29        4        12

Method 1: Calculate Standard Deviation of One Column

The following code shows how to calculate the standard deviation of one column in the DataFrame:

#calculate standard deviation of 'points' column
df['points'].std() 

6.158617655657106

The standard deviation turns out to be 6.1586.
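To check where this number comes from, the sketch below reproduces it by hand using the sample formula (dividing by n - 1, which is what the default ddof=1 means) and also shows the population version (ddof=0) for comparison. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

#same values as the 'points' column
points = pd.Series([25, 12, 15, 14, 19, 23, 25, 29])

#sample standard deviation (pandas default, ddof=1)
sample_sd = points.std()

#manual calculation: square root of the sum of squared deviations over n - 1
n = len(points)
manual_sd = np.sqrt(((points - points.mean()) ** 2).sum() / (n - 1))

#population standard deviation divides by n instead
population_sd = points.std(ddof=0)

print(round(sample_sd, 4))      #6.1586
print(round(manual_sd, 4))      #6.1586
print(round(population_sd, 4))  #5.7609
```

Keep in mind that NumPy's np.std() defaults to ddof=0, so pandas and NumPy give different answers on the same data unless ddof is set explicitly.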

Method 2: Calculate Standard Deviation of Multiple Columns

The following code shows how to calculate the standard deviation of multiple columns in the DataFrame:

#calculate standard deviation of 'points' and 'rebounds' columns
df[['points', 'rebounds']].std()

points      6.158618
rebounds    2.559994
dtype: float64

The standard deviation of the ‘points’ column is 6.1586 and the standard deviation of the ‘rebounds’ column is 2.5600.

Method 3: Calculate Standard Deviation of All Numeric Columns

The following code shows how to calculate the standard deviation of every numeric column in the DataFrame:

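#calculate standard deviation of all numeric columns
df.std(numeric_only=True)

points      6.158618
assists     2.549510
rebounds    2.559994
dtype: float64

Notice that pandas did not calculate the standard deviation of the ‘team’ column because it is not numeric. The numeric_only=True argument is what excludes it; in pandas 2.0 and later, calling df.std() without it on this DataFrame raises a TypeError.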
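An alternative way to restrict the calculation to numeric columns is to select them first with select_dtypes(). The sketch below rebuilds the example DataFrame so that it runs on its own and compares the two approaches; the variable names are illustrative only.

```python
import pandas as pd

#recreate the example DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
                   'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#option 1: ask std() to skip non-numeric columns
sd_numeric_only = df.std(numeric_only=True)

#option 2: select the numeric columns first, then call std()
sd_selected = df.select_dtypes(include='number').std()

print(sd_numeric_only)
print(sd_selected)
```

Both options should return the same three values. Selecting columns first can be convenient when the same numeric subset is reused for several calculations.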

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

How to Calculate the Mean of Columns in Pandas
How to Calculate the Median of Columns in Pandas
How to Calculate the Max Value of Columns in Pandas
