How to Use the info() Method in Pandas


Often you may want to print a summary of a pandas DataFrame.

One of the most common ways to do so is by using the info() method, which uses the following syntax:

DataFrame.info(verbose=None, buf=None, max_cols=None, memory_usage=None, show_counts=None)

where:

  • verbose: Whether to print the full summary, including the per-column details
  • buf: Where to send the output (defaults to sys.stdout)
  • max_cols: The number of columns above which the output switches from verbose to truncated
  • memory_usage: Whether total memory usage of the DataFrame elements should be displayed
  • show_counts: Whether to show non-null counts
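
As a minimal sketch of the buf parameter, the summary can be written to an in-memory text buffer instead of the console, which can be handy for logging it or saving it to a file. The small DataFrame and the names demo, buffer and summary are assumptions made only for illustration:

```python
import io

import pandas as pd

#small illustrative DataFrame (hypothetical data)
demo = pd.DataFrame({'x': [1, 2, 3], 'y': [1.5, None, 3.5]})

#send the summary to an in-memory text buffer instead of the console
buffer = io.StringIO()
demo.info(buf=buffer)

#retrieve the summary as an ordinary string
summary = buffer.getvalue()
print(summary)
```

Note that info() itself returns None, so passing a buffer is the usual way to get the summary text into a variable.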

By using this single info() method, we are able to gain a good understanding of each column in a pandas DataFrame.

The following example shows how to use the info() method in practice with a pandas DataFrame.

Example: How to Use the info() Method in Pandas

Suppose we create the following pandas DataFrame that contains information about various basketball players:

import pandas as pd
import numpy as np

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'],
                   'points': [12, 14, 18, 13, np.nan, np.nan, 20, 29],
                   'assists': [10, 22, 24, 20, 14, 18, 10, 12]})

#view DataFrame
print(df)

  team  points  assists
0    A    12.0       10
1    A    14.0       22
2    B    18.0       24
3    B    13.0       20
4    C     NaN       14
5    C     NaN       18
6    C    20.0       10
7    D    29.0       12

Suppose that we would like to generate a summary of each column in this particular DataFrame.

We can use the info() method to do so:

#print summary of DataFrame
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 3 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   team     8 non-null      object 
 1   points   6 non-null      float64
 2   assists  8 non-null      int64  
dtypes: float64(1), int64(1), object(1)
memory usage: 324.0+ bytes

The output displays a variety of information that summarizes the DataFrame.

Here is how to interpret each line in the output:

The first line shows the class of the object, which is a pandas DataFrame.

The second line describes the index of the DataFrame, which has 8 entries labeled 0 to 7.

The next portion of the output shows the index number, column name, non-null count and dtype of each column.

For example, we can see:

  • The team column has the object dtype, which in this case means it holds strings.
  • The points column is a floating point number column.
  • The assists column is an integer column.
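
If you need these per-column details as values rather than printed text, a rough equivalent can be pieced together from other DataFrame methods. The sketch below rebuilds the same DataFrame and uses dtypes, count() and isna(); the names non_null and missing are only illustrative:

```python
import numpy as np
import pandas as pd

#same DataFrame as in the example above
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'],
                   'points': [12, 14, 18, 13, np.nan, np.nan, 20, 29],
                   'assists': [10, 22, 24, 20, 14, 18, 10, 12]})

#dtype of each column
print(df.dtypes)

#number of non-null values in each column
non_null = df.count()
print(non_null)

#number of missing values in each column
missing = df.isna().sum()
print(missing)
```

Here the points column should report 6 non-null values and 2 missing values, matching the Non-Null Count column of the info() output.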

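The second-to-last line counts how many columns have each dtype.

The last line in the output displays the total memory usage of the DataFrame elements. The + sign indicates that this value is a lower bound, because the memory held by the values in an object column is not measured by default.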
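
For a fuller memory figure, memory_usage='deep' asks pandas to also inspect the contents of object columns. The following is a minimal sketch that rebuilds the same DataFrame; the names shallow and deep are only illustrative, and the exact byte counts will vary by pandas version and platform:

```python
import numpy as np
import pandas as pd

#same DataFrame as in the example above
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'],
                   'points': [12, 14, 18, 13, np.nan, np.nan, 20, 29],
                   'assists': [10, 22, 24, 20, 14, 18, 10, 12]})

#print summary with a deep memory measurement
df.info(memory_usage='deep')

#per-column memory usage in bytes, without and with deep inspection
shallow = df.memory_usage()
deep = df.memory_usage(deep=True)
print(shallow)
print(deep)
```

The deep measurement takes longer on large DataFrames, which is presumably why it is not the default.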

Note that we can set the show_counts and memory_usage arguments to False to hide the non-null counts in each column and the total memory usage, which aren’t always of interest:

#print summary of DataFrame with less info
df.info(show_counts=False, memory_usage=False)

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 3 columns):
 #   Column   Dtype  
---  ------   -----  
 0   team     object 
 1   points   float64
 2   assists  int64  
dtypes: float64(1), int64(1), object(1)

Notice that the output no longer shows the non-null count of elements in each column in the DataFrame and it no longer displays the total memory usage.
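
The verbose and max_cols arguments go one step further and drop the per-column table entirely. A minimal sketch, again rebuilding the same DataFrame (the buffer names full and short are only illustrative):

```python
import io

import numpy as np
import pandas as pd

#same DataFrame as in the example above
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'],
                   'points': [12, 14, 18, 13, np.nan, np.nan, 20, 29],
                   'assists': [10, 22, 24, 20, 14, 18, 10, 12]})

#print truncated summary without the per-column table
df.info(verbose=False)

#truncated output is also used when the DataFrame has more columns than max_cols
df.info(max_cols=2)

#capture the full and truncated summaries to compare them
full, short = io.StringIO(), io.StringIO()
df.info(buf=full)
df.info(buf=short, verbose=False)
```

The truncated form only reports how many columns exist and the names of the first and last one, which can be easier to read for very wide DataFrames.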

Note: The complete documentation for the info() method is available in the official pandas API reference.

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Use qcut() in Pandas
How to Use pct_change() in Pandas
How to Use the map() Function in Pandas