How to Add a NumPy Array to a Pandas DataFrame


Occasionally you may want to add a NumPy array as a new column to a pandas DataFrame.

Fortunately you can easily do this using the following syntax:

df['new_column'] = array_name.tolist()

This tutorial shows a couple of examples of how to use this syntax in practice.

Example 1: Add NumPy Array as New Column in DataFrame

The following code shows how to create a pandas DataFrame to hold some stats for basketball players and append a NumPy array as a new column titled ‘blocks’:

import numpy as np
import pandas as pd

#create pandas DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#create NumPy array for 'blocks'
blocks = np.array([2, 3, 1, 0, 2, 7, 8, 2])

#add 'blocks' array as new column in DataFrame
df['blocks'] = blocks.tolist()

#display the DataFrame
print(df)

   points  assists  rebounds  blocks
0      25        5        11       2
1      12        7         8       3
2      15        7        10       1
3      14        9         6       0
4      19       12         6       2
5      23        9         5       7
6      25        9         9       8
7      29        4        12       2

Note that the DataFrame now has an extra column titled 'blocks'.
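The array must have exactly one value per row of the DataFrame. The following is a minimal sketch, not part of the original example, that checks two things: assigning the array directly gives the same values as using .tolist(), and an array of the wrong length raises a ValueError. The column names 'blocks_list', 'blocks_array' and 'short' are made up for illustration.

```python
import numpy as np
import pandas as pd

# smaller version of the DataFrame from Example 1
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29]})
blocks = np.array([2, 3, 1, 0, 2, 7, 8, 2])

# add the array once as a list and once directly (hypothetical column names)
df['blocks_list'] = blocks.tolist()
df['blocks_array'] = blocks

# compare the two columns value by value
same_values = bool((df['blocks_list'] == df['blocks_array']).all())
print(same_values)

# an array with 3 values cannot fill a DataFrame with 8 rows
too_short = np.array([1, 2, 3])
try:
    df['short'] = too_short
    length_error = None
except ValueError as err:
    length_error = str(err)

print(length_error)
```

If the lengths differ, it is usually safer to find out why before padding or trimming the array to fit.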

Example 2: Add NumPy Matrix as New Columns in DataFrame

The following code shows how to create a pandas DataFrame to hold some stats for basketball players and append a NumPy matrix as two new columns:

import numpy as np
import pandas as pd

#create pandas DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#create NumPy matrix
mat = np.matrix([[2, 3],
                 [1, 0],
                 [2, 7],
                 [8, 2],
                 [3, 4],
                 [7, 7],
                 [7, 5],
                 [6, 3]])

#add NumPy matrix as new columns in DataFrame
df_new = pd.concat([df, pd.DataFrame(mat)], axis=1)

#display new DataFrame
print(df_new)

   points  assists  rebounds  0  1
0      25        5        11  2  3
1      12        7         8  1  0
2      15        7        10  2  7
3      14        9         6  8  2
4      19       12         6  3  4
5      23        9         5  7  7
6      25        9         9  7  5
7      29        4        12  6  3

Note that the columns added from the matrix are given the default names 0 and 1.

We can easily rename these columns by assigning a list of names to the columns attribute of the new DataFrame:

#rename columns
df_new.columns = ['pts', 'ast', 'rebs', 'new1', 'new2']

#display DataFrame
print(df_new)

   pts  ast  rebs  new1  new2
0   25    5    11     2     3
1   12    7     8     1     0
2   15    7    10     2     7
3   14    9     6     8     2
4   19   12     6     3     4
5   23    9     5     7     7
6   25    9     9     7     5
7   29    4    12     6     3
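As an alternative to renaming afterwards, the new columns can be named when the matrix is converted to a DataFrame. The following is a minimal sketch under a few assumptions: it uses a shortened three-row DataFrame, a regular 2D NumPy array instead of np.matrix, and passes index=df.index so the new rows line up with the existing ones (pd.concat with axis=1 aligns rows on the index).

```python
import numpy as np
import pandas as pd

# shortened DataFrame for illustration
df = pd.DataFrame({'points': [25, 12, 15],
                   'assists': [5, 7, 7]})

# 2D NumPy array with one row per DataFrame row
arr = np.array([[2, 3],
                [1, 0],
                [2, 7]])

# name the new columns up front and reuse the existing index
new_cols = pd.DataFrame(arr, columns=['new1', 'new2'], index=df.index)

# add the named columns to the original DataFrame
df_new = pd.concat([df, new_cols], axis=1)

print(df_new)
```

This leaves the original column names untouched, which may be preferable when only the added columns need names.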

Additional Resources

How to Stack Multiple Pandas DataFrames
How to Merge Two Pandas DataFrames on Index
How to Rename Columns in Pandas
