How to Use the get_loc() Function in Pandas


Often you may want to get the integer location of a particular label in a pandas DataFrame.

The easiest way to do so is by using the get_loc() function, which is a method of the pandas Index object and uses the following syntax:

pandas.Index.get_loc(key)

where:

  • key: The label whose integer location you would like to retrieve

In most cases you will use this function on df.columns to retrieve the integer location of a particular column name in a pandas DataFrame, but it can also be used on df.index to retrieve the integer location of a particular row label.
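The row case can be sketched as follows. This is a minimal illustration that assumes a small, hypothetical DataFrame with string row labels; the name scores and its values are made up for this sketch and are not part of the example used in the rest of this tutorial.

```python
import pandas as pd

# hypothetical DataFrame with string row labels (illustrative only)
scores = pd.DataFrame({'points': [12, 18, 30]},
                      index=['guard', 'forward', 'center'])

# retrieve integer location of the row labeled 'forward'
row_pos = scores.index.get_loc('forward')
print(row_pos)  # 1
```

Note that a single integer is returned only when the label is unique. If a label appears more than once in the index, get_loc() returns a slice or a boolean mask instead.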

The following example shows how to use the get_loc() function in practice with a pandas DataFrame.

Example: How to Use the get_loc() Function in Pandas

Suppose we create the following pandas DataFrame that contains information about various basketball players:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'],
                   'points': [12, 18, 18, 22, 30, 41, 12, 29],
                   'assists': [8, 10, 11, 11, 7, 12, 8, 5],
                   'rebounds': [10, 9, 18, 20, 13, 10, 7, 3]})

#view DataFrame
print(df)

  team  points  assists  rebounds
0    A      12        8        10
1    A      18       10         9
2    B      18       11        18
3    B      22       11        20
4    C      30        7        13
5    C      41       12        10
6    C      12        8         7
7    D      29        5         3

Suppose that we would like to retrieve the integer location of the column named assists.

We can use the following syntax with the get_loc() function to do so:

#retrieve integer location of the column named 'assists'
df.columns.get_loc('assists')

2

The output displays a value of 2, which tells us that the column named ‘assists’ is in index position 2.

Note: The first column has an index value of 0.
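If the label does not exist in the DataFrame, get_loc() raises a KeyError. One possible way to handle that is sketched below; the helper name safe_get_loc is hypothetical (it is not part of pandas), and the small DataFrame is a shortened stand-in for the one above.

```python
import pandas as pd

# shortened stand-in for the DataFrame used in this tutorial
df = pd.DataFrame({'team': ['A', 'B'],
                   'points': [12, 18],
                   'assists': [8, 10]})

def safe_get_loc(index, label):
    """Hypothetical helper: return the integer location of label, or None if it is missing."""
    try:
        return index.get_loc(label)
    except KeyError:
        return None

print(safe_get_loc(df.columns, 'assists'))  # 2
print(safe_get_loc(df.columns, 'steals'))   # None
```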

Note that we used df.columns, which returns an Index object that holds all of the column names in the DataFrame.

The following syntax shows what df.columns returns on its own, without the get_loc() function:

#print list of columns in DataFrame
df.columns

Index(['team', 'points', 'assists', 'rebounds'], dtype='object')

The output displays an Index object that contains all of the column names.

Also note that we can combine the get_loc() function with the iloc indexer to return an individual element in the DataFrame based on a row index position and a column index position.

The following example shows how to do so in practice:

#retrieve element in row index 4 and integer position of the 'assists' column
df.iloc[4, df.columns.get_loc('assists')]

7

This returns a value of 7, which is the value in row index 4 (i.e. the 5th row in the DataFrame) of the ‘assists’ column.

Note that the benefit of using get_loc() in this scenario is that we don’t need to know ahead of time which integer location the ‘assists’ column has.
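The same idea extends to slices. As a minimal sketch, assuming a shortened version of the DataFrame above, the looked-up position can serve as the start of an iloc column slice, so that nothing is hard-coded:

```python
import pandas as pd

# shortened version of the DataFrame used in this tutorial
df = pd.DataFrame({'team': ['A', 'A', 'B'],
                   'points': [12, 18, 18],
                   'assists': [8, 10, 11],
                   'rebounds': [10, 9, 18]})

# look up the position instead of hard-coding the value 2
start = df.columns.get_loc('assists')

# select all rows, and every column from 'assists' to the last column
tail = df.iloc[:, start:]
print(list(tail.columns))  # ['assists', 'rebounds']
```

If the columns are later reordered, this code still starts the slice at ‘assists’ because the position is looked up each time it runs.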

For this particular example, we used a DataFrame with only four columns for simplicity.

In practice, the get_loc() function is particularly helpful when a DataFrame has hundreds or even thousands of columns and you can’t easily tell which integer position a specific column name occupies.
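When you need the positions of several columns at once, one option is the related Index.get_indexer() method, which accepts a list of labels. The sketch below assumes a shortened version of the DataFrame above; the label 'steals' is included only to show what happens for a column that does not exist.

```python
import pandas as pd

# shortened version of the DataFrame used in this tutorial
df = pd.DataFrame({'team': ['A', 'B'],
                   'points': [12, 18],
                   'assists': [8, 10],
                   'rebounds': [10, 9]})

# get_indexer() returns one integer position per label;
# a label that is not found is reported as -1 instead of raising an error
positions = df.columns.get_indexer(['points', 'rebounds', 'steals'])
print(positions.tolist())  # [1, 3, -1]

# keep only the positions that were found, then select those columns
found = positions[positions >= 0]
subset = df.iloc[:, found]
print(list(subset.columns))  # ['points', 'rebounds']
```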

Note: You can find the complete documentation for the get_loc() function in the official pandas documentation for pandas.Index.get_loc.

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Use pct_change() in Pandas
Pandas: How to Use isin() for Multiple Columns
How to Create a Tuple from Two Columns in Pandas
