Pandas: How to Use First Column as Index


You can use either of the following methods to set the first column as the index of a pandas DataFrame:

Method 1: Use First Column as Index When Importing DataFrame

df = pd.read_csv('my_data.csv', index_col=0)

Method 2: Use First Column as Index with Existing DataFrame

df = df.set_index(['column1'])

The following examples show how to use each method in practice.

Example 1: Use First Column as Index When Importing DataFrame

Suppose we have a CSV file called my_data.csv with three columns: team, points, and assists.

If we import the CSV file without specifying an index column, pandas creates a default integer index that starts at 0:

import pandas as pd

#import CSV file without specifying index column
df = pd.read_csv('my_data.csv')

#view DataFrame
print(df)

  team  points  assists
0    A      18        5
1    B      22        7
2    C      19        7
3    D      14        9
4    E      14       12
5    F      11        9
6    G      20        9
7    H      28        4

However, we can use the index_col argument to specify that the first column in the CSV file should be used as the index column:

#import CSV file and specify index column
df = pd.read_csv('my_data.csv', index_col=0)

#view DataFrame
print(df)

      points  assists
team                 
A         18        5
B         22        7
C         19        7
D         14        9
E         14       12
F         11        9
G         20        9
H         28        4

Notice that the team column is now used as the index column.
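To try this without a file on disk, here is a minimal sketch that reads the same data from an in-memory string. The csv_text variable is a hypothetical stand-in for my_data.csv, rebuilt from the output shown above and assumed to be comma-separated. The sketch also shows that index_col accepts a column name as well as a position:

```python
import io
import pandas as pd

# Hypothetical stand-in for my_data.csv (assumed to be comma-separated)
csv_text = """team,points,assists
A,18,5
B,22,7
C,19,7
D,14,9
E,14,12
F,11,9
G,20,9
H,28,4
"""

# Select the index column by position (0 = first column)
by_position = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Select the same column by name
by_name = pd.read_csv(io.StringIO(csv_text), index_col='team')

# Both calls produce the same DataFrame
print(by_position.equals(by_name))

# With team as the index, rows can be looked up by label
print(by_position.loc['B', 'points'])
```

Selecting by position is convenient when we do not know the header in advance, while selecting by name keeps working if the column order in the file changes.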

Example 2: Use First Column as Index with Existing DataFrame

Suppose we have the following existing pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4]})

#view DataFrame
print(df)

  team  points  assists
0    A      18        5
1    B      22        7
2    C      19        7
3    D      14        9
4    E      14       12
5    F      11        9
6    G      20        9
7    H      28        4

We can use the set_index() method to set the team column as the index column:

#set 'team' column as index column
df = df.set_index(['team'])

#view updated DataFrame
print(df)

      points  assists
team                 
A         18        5
B         22        7
C         19        7
D         14        9
E         14       12
F         11        9
G         20        9
H         28        4

Notice that the team column is now used as the index column.
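If we do not want to hard-code the column name, one option is to look up the first column by position with df.columns[0]. The following is a minimal sketch on a shortened version of the DataFrame; the variable names first_col, indexed, and restored are illustrative only:

```python
import pandas as pd

# Shortened version of the example DataFrame
df = pd.DataFrame({'team': ['A', 'B', 'C'],
                   'points': [18, 22, 19],
                   'assists': [5, 7, 7]})

# Look up the name of the first column instead of hard-coding it
first_col = df.columns[0]

# set_index() returns a new DataFrame; df itself is left unchanged
indexed = df.set_index(first_col)
print(indexed)

# reset_index() moves the index back into a regular column
restored = indexed.reset_index()
print(restored)
```

Because set_index() returns a new DataFrame by default, the result must be assigned to a variable, as in the example above, for the change to persist.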

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Select Columns by Index in a Pandas DataFrame
How to Rename Index in Pandas DataFrame
How to Drop Columns by Index in Pandas
