The Ultimate Guide: How to Read CSV Files with Pandas


CSV (comma-separated values) files are one of the most common ways to store data. Fortunately, the pandas function read_csv() lets you read CSV files into Python in almost any format you’d like.

This tutorial explains several ways to read CSV files into Python using the following CSV file named ‘data.csv’:

playerID,team,points
1,Lakers,26
2,Mavs,19
3,Bucks,24
4,Spurs,22

Example 1: Read CSV File into pandas DataFrame

The following code shows how to read the CSV file into a pandas DataFrame:

import pandas as pd

#import CSV file as DataFrame
df = pd.read_csv('data.csv')

#view DataFrame
df

   playerID    team  points
0         1  Lakers      26
1         2    Mavs      19
2         3   Bucks      24
3         4   Spurs      22
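
To try this example without creating a file on disk, here is a minimal sketch that passes the same CSV text to read_csv() through io.StringIO. The in-memory buffer and the variable name csv_text are assumptions made for illustration; the tutorial itself reads from ‘data.csv’.

```python
import io

import pandas as pd

# Same contents as data.csv, held in memory so the sketch is self-contained
csv_text = """playerID,team,points
1,Lakers,26
2,Mavs,19
3,Bucks,24
4,Spurs,22
"""

# read_csv() accepts a file-like object as well as a file path
df = pd.read_csv(io.StringIO(csv_text))

# Number of (rows, columns); the first line of the file became the column names
print(df.shape)
print(list(df.columns))

# The points column is inferred as an integer type, so it can be summed directly
print(df["points"].sum())
```

The same pattern is handy for quick experiments, since the data and the call that parses it sit in one place.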

Example 2: Read Specific Columns from CSV File

The following code shows how to read only the columns titled ‘playerID’ and ‘points’ in the CSV file into a pandas DataFrame:

#import only specific columns from CSV file
df = pd.read_csv('data.csv', usecols=['playerID', 'points'])

#view DataFrame
df

   playerID  points
0         1      26
1         2      19
2         3      24
3         4      22

Alternatively, you can specify zero-based column indices to read into a pandas DataFrame:

#import only specific columns from CSV file
df = pd.read_csv('data.csv', usecols=[0, 1])

#view DataFrame
df

   playerID    team
0         1  Lakers
1         2    Mavs
2         3   Bucks
3         4   Spurs
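
The usecols argument also accepts a function that is called once per column name, which can help when it is easier to say which columns to drop than which to keep. The sketch below is illustrative only: it reads the sample data from an in-memory string (an assumption, so that it runs without ‘data.csv’), and it also shows that the order of names in a usecols list does not change the column order of the result.

```python
import io

import pandas as pd

# Same contents as data.csv, held in memory so the sketch is self-contained
csv_text = """playerID,team,points
1,Lakers,26
2,Mavs,19
3,Bucks,24
4,Spurs,22
"""

# A callable is evaluated against each column name; names returning True are kept
df_no_team = pd.read_csv(io.StringIO(csv_text), usecols=lambda name: name != "team")
print(list(df_no_team.columns))

# The order of the list is ignored: columns come back in file order
df_reordered = pd.read_csv(io.StringIO(csv_text), usecols=["points", "playerID"])
print(list(df_reordered.columns))

# To get a specific order, reorder the columns after reading
df_points_first = df_reordered[["points", "playerID"]]
print(list(df_points_first.columns))
```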

Example 3: Specify Header Row when Importing CSV File

In some cases, the header row might not be the first row in a CSV file. For example, consider the following CSV file in which the header is actually the second row:

random,data,values
playerID,team,points
1,Lakers,26
2,Mavs,19
3,Bucks,24
4,Spurs,22

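To read this CSV file into a pandas DataFrame, we can specify header=1 as follows. Row numbers are zero-based, so header=1 refers to the second row, and any rows above it are ignored: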

#import from CSV file and specify that header starts on second row
df = pd.read_csv('data.csv', header=1)

#view DataFrame
df

   playerID    team  points
0         1  Lakers      26
1         2    Mavs      19
2         3   Bucks      24
3         4   Spurs      22
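
As a quick check of how header=1 behaves, the sketch below reads the same six lines from an in-memory string (an assumption for illustration, in place of the file on disk) and compares the result with skipping the first line instead. For this particular file the two approaches should produce the same DataFrame.

```python
import io

import pandas as pd

# The file from this example: one junk line above the real header
csv_text = """random,data,values
playerID,team,points
1,Lakers,26
2,Mavs,19
3,Bucks,24
4,Spurs,22
"""

# header=1: use the second line as column names and ignore the line above it
df_header = pd.read_csv(io.StringIO(csv_text), header=1)

# skiprows=1: drop the first line, then treat the next line as the header
df_skip = pd.read_csv(io.StringIO(csv_text), skiprows=1)

print(list(df_header.columns))
print(len(df_header))

# Both calls yield the same columns and the same four data rows
print(df_header.equals(df_skip))
```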

Example 4: Skip Rows when Importing CSV File

You can also skip rows when importing a CSV file by using the skiprows argument. Row numbers in a list passed to skiprows are zero-based and count the header line as row 0, so the following code skips the second line of the file (the first data row):

#import from CSV file and skip second row
df = pd.read_csv('data.csv', skiprows=[1])

#view DataFrame
df

   playerID   team  points
0         2   Mavs      19
1         3  Bucks      24
2         4  Spurs      22

And the following code shows how to skip the second and third rows when importing the CSV file:

#import from CSV file and skip second and third rows
df = pd.read_csv('data.csv', skiprows=[1, 2])

#view DataFrame
df

   playerID   team  points
0         3  Bucks      24
1         4  Spurs      22
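
The form of the skiprows value matters: a list names specific line numbers, a callable is tested against every line number, and a plain integer skips that many lines from the top of the file, header included. The sketch below illustrates all three on the sample data, read from an in-memory string (an assumption so that it runs without ‘data.csv’).

```python
import io

import pandas as pd

# Same contents as data.csv, held in memory so the sketch is self-contained
csv_text = """playerID,team,points
1,Lakers,26
2,Mavs,19
3,Bucks,24
4,Spurs,22
"""

# List: skip line numbers 1 and 2 (zero-based), keeping the header on line 0
by_list = pd.read_csv(io.StringIO(csv_text), skiprows=[1, 2])

# Callable: called with each line number; lines returning True are skipped
by_callable = pd.read_csv(io.StringIO(csv_text), skiprows=lambda i: i in (1, 2))

# Both keep the Bucks and Spurs rows, and the index restarts at 0
print(by_list["team"].tolist())
print(list(by_list.index))

# Integer: skip the first two lines, so the header is lost and the
# line "2,Mavs,19" is used as the column names instead
by_int = pd.read_csv(io.StringIO(csv_text), skiprows=2)
print(list(by_int.columns))
```

Passing an integer is usually the wrong choice when the first line is a real header, which is why the examples here use a list.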

Example 5: Read CSV Files with Custom Delimiter

Occasionally you may have a CSV file with a delimiter that is different from a comma. For example, suppose our CSV file has an underscore as a delimiter:

playerID_team_points
1_Lakers_26
2_Mavs_19
3_Bucks_24
4_Spurs_22

To read this CSV file into pandas, we can use the sep argument to specify the delimiter to use when reading the file:

#import from CSV file and specify delimiter to use
df = pd.read_csv('data.csv', sep='_')

#view DataFrame
df

   playerID    team  points
0         1  Lakers      26
1         2    Mavs      19
2         3   Bucks      24
3         4   Spurs      22
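
To see why the sep argument is needed, the sketch below reads the underscore-delimited text twice, once with the default comma delimiter and once with sep='_'. The data is read from an in-memory string, which is an assumption made so the snippet runs on its own.

```python
import io

import pandas as pd

# Underscore-delimited version of the sample data
csv_text = """playerID_team_points
1_Lakers_26
2_Mavs_19
3_Bucks_24
4_Spurs_22
"""

# Default delimiter is a comma, so each line is read as a single field
df_default = pd.read_csv(io.StringIO(csv_text))
print(df_default.shape)
print(list(df_default.columns))

# With sep='_' each line is split into three fields
df_sep = pd.read_csv(io.StringIO(csv_text), sep="_")
print(df_sep.shape)
print(list(df_sep.columns))
```

A DataFrame with a single, oddly named column is often the first sign that the delimiter was not what read_csv() expected.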

You can find the complete documentation for the pandas read_csv() function in the official pandas API reference.
