How to Fix: ValueError: cannot convert float NaN to integer


One error you may encounter when using pandas is:

ValueError: cannot convert float NaN to integer

This error occurs when you attempt to convert a column in a pandas DataFrame from a float to an integer, yet the column contains NaN values.

The following example shows how to fix this error in practice.

How to Reproduce the Error

Suppose we create the following pandas DataFrame:

import pandas as pd
import numpy as np

#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, np.nan, 10, 6, 5, np.nan, 9, 12]})

#view DataFrame
df

   points  assists  rebounds
0      25        5      11.0
1      12        7       NaN
2      15        7      10.0
3      14        9       6.0
4      19       12       5.0
5      23        9       NaN
6      25        9       9.0
7      29        4      12.0

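Currently the ‘rebounds’ column has the data type float64. pandas stores NaN as a floating-point value, so a numeric column that contains NaN is held as floats by default: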

#print data type of 'rebounds' column
df['rebounds'].dtype

dtype('float64')

Suppose we attempt to convert the ‘rebounds’ column from a float to an integer:

#attempt to convert 'rebounds' column from float to integer
df['rebounds'] = df['rebounds'].astype(int)

ValueError: cannot convert float NaN to integer

We receive a ValueError because the NaN values in the ‘rebounds’ column cannot be converted to integer values.
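The reason can be seen in a small sketch: NaN is itself a float, and the plain integer type has no value that represents "missing". The Series name rebounds_demo below is only an illustration. The exact wording of the pandas message may differ between versions, so the sketch only checks that a ValueError (or a subclass of it) is raised.

```python
import numpy as np
import pandas as pd

#NaN is a float value, not a special integer
nan_type = type(np.nan)
print(nan_type)

#plain Python refuses to turn a NaN scalar into an integer
try:
    int(float('nan'))
except ValueError as err:
    scalar_message = str(err)
    print(scalar_message)

#the same problem surfaces when pandas casts a column that contains NaN
rebounds_demo = pd.Series([11, np.nan, 10])
try:
    rebounds_demo.astype(int)
    raised = False
except ValueError:
    raised = True
print(raised)
```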

How to Fix the Error

The way to fix this error is to deal with the NaN values before attempting to convert the column from a float to an integer.

We can use the following code to first identify the rows that contain NaN values:

#print rows in DataFrame that contain NaN in 'rebounds' column
print(df[df['rebounds'].isnull()])

   points  assists  rebounds
1      12        7       NaN
5      23        9       NaN
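
On a larger DataFrame it may be more practical to count the NaN values than to print every affected row. A minimal sketch using the same example data (the names nan_counts and has_nan are just illustrative):

```python
import numpy as np
import pandas as pd

#recreate the example DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, np.nan, 10, 6, 5, np.nan, 9, 12]})

#count NaN values in each column
nan_counts = df.isna().sum()
print(nan_counts)

#check whether the 'rebounds' column contains any NaN at all
has_nan = bool(df['rebounds'].isna().any())
print(has_nan)
```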

We can then either drop the rows with NaN values or replace the NaN values with some other value before converting the column from a float to an integer:

Method 1: Drop Rows with NaN Values

#drop all rows with NaN values
df = df.dropna()

#convert 'rebounds' column from float to integer
df['rebounds'] = df['rebounds'].astype(int) 

#view updated DataFrame
df

   points  assists  rebounds
0      25        5        11
2      15        7        10
3      14        9         6
4      19       12         5
6      25        9         9
7      29        4        12

#view data type of 'rebounds' column
df['rebounds'].dtype

dtype('int64')
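
Note that dropna() with no arguments removes a row if any column contains NaN. If only the ‘rebounds’ column matters, one option is to limit the check with the subset argument, as in the sketch below. The name df_clean is an illustration, and the .copy() call is an optional precaution so that the later assignment works on an independent DataFrame. With this example data the result is the same as Method 1, because ‘rebounds’ is the only column with NaN values.

```python
import numpy as np
import pandas as pd

#recreate the example DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, np.nan, 10, 6, 5, np.nan, 9, 12]})

#drop only the rows where 'rebounds' is NaN
df_clean = df.dropna(subset=['rebounds']).copy()

#convert 'rebounds' column from float to integer
df_clean['rebounds'] = df_clean['rebounds'].astype(int)

#view remaining row labels and converted values
print(df_clean.index.tolist())
print(df_clean['rebounds'].tolist())
```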

Method 2: Replace NaN Values

#replace all NaN values with zeros (starting again from the original DataFrame)
df['rebounds'] = df['rebounds'].fillna(0)

#convert 'rebounds' column from float to integer
df['rebounds'] = df['rebounds'].astype(int) 

#view updated DataFrame
df

   points  assists  rebounds
0      25        5        11
1      12        7         0
2      15        7        10
3      14        9         6
4      19       12         5
5      23        9         0
6      25        9         9
7      29        4        12

#view data type of 'rebounds' column
df['rebounds'].dtype

dtype('int64')
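
Zero is only one possible replacement value. As a hedged sketch, the NaN values could instead be filled with the column mean. That choice is an assumption made for illustration, not something the example data calls for. Because astype(int) truncates the decimal part, the sketch rounds before converting.

```python
import numpy as np
import pandas as pd

#recreate the example DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, np.nan, 10, 6, 5, np.nan, 9, 12]})

#mean of the non-NaN values (NaN is skipped by default)
mean_rebounds = df['rebounds'].mean()

#fill NaN with the mean, round to the nearest whole number, then convert
df['rebounds'] = df['rebounds'].fillna(mean_rebounds).round().astype(int)

print(df['rebounds'].tolist())
```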

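Note that both methods allow us to avoid the ValueError and successfully convert the float column to an integer column. Method 1 discards the rows with missing values, while Method 2 keeps every row but records the missing values as zeros.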
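
If neither dropping rows nor substituting a value is acceptable, a third option worth knowing about is the pandas nullable integer type, written 'Int64' with a capital I. The sketch below assumes a pandas version that supports this extension type and a column whose non-missing values are whole numbers. The missing entries are kept and displayed as <NA>.

```python
import numpy as np
import pandas as pd

#recreate the example DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, np.nan, 10, 6, 5, np.nan, 9, 12]})

#convert to the nullable integer type; missing values are preserved
df['rebounds'] = df['rebounds'].astype('Int64')

#view data type and number of missing values
print(df['rebounds'].dtype)
print(df['rebounds'].isna().sum())
```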

Additional Resources

The following tutorials explain how to fix other common errors in Python:

How to Fix: columns overlap but no suffix specified
How to Fix: ‘numpy.ndarray’ object has no attribute ‘append’
How to Fix: if using all scalar values, you must pass an index
