Pandas: How to Apply Function to Multiple Columns


Often you may want to create a function that you can apply to multiple columns in a pandas DataFrame.

The easiest way to do this is by using a lambda function inside the apply() function in pandas.

You can use the following basic syntax to do so:

df['new_col'] = df.apply(lambda x: f(x.points, x.assists), axis=1)

This particular example applies the function named f to the points and assists columns of the DataFrame and stores the results in a new column named new_col.
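Attribute access such as x.points only works when the column name is a valid Python identifier that does not clash with an existing Series attribute. As a minimal sketch, assuming a hypothetical DataFrame with a column named 'total points' and a stand-in function f (neither is part of the example in this tutorial), bracket notation can be used instead:

```python
import pandas as pd

# hypothetical stand-in for the function f
def f(first, second):
    return first + second

# hypothetical DataFrame whose first column name contains a space
df = pd.DataFrame({'total points': [12, 15, 29],
                   'assists': [8, 10, 11]})

# bracket notation works for any column label
df['new_col'] = df.apply(lambda x: f(x['total points'], x['assists']), axis=1)

print(df)
```

Bracket notation is slightly more verbose, but it behaves the same way for every column name.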

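Note that we used the argument axis=1 to specify that the function should be applied to each row of the DataFrame. This means the lambda function receives one row at a time and can read values from several columns of that row.

If you instead specify axis=0, the function is applied to each column of the DataFrame. In that case the lambda function receives an entire column at a time, so a lookup such as x.points would no longer work.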
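To make the effect of the axis argument concrete, the following sketch uses a small hypothetical DataFrame (not the one from the example in this tutorial) and shows what the lambda function receives in each case:

```python
import pandas as pd

# small hypothetical DataFrame for illustration
df = pd.DataFrame({'points': [12, 15, 29],
                   'assists': [8, 10, 11]})

# axis=1: the lambda receives one row at a time
row_sums = df.apply(lambda row: row['points'] + row['assists'], axis=1)

# axis=0: the lambda receives one column at a time
col_max = df.apply(lambda col: col.max(), axis=0)

print(row_sums.tolist())  # [20, 25, 40]
print(col_max.tolist())   # [29, 11]
```

With axis=1 the result has one value per row, while with axis=0 it has one value per column.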

The following example shows how to use the apply and lambda functions in practice to apply a function to multiple columns in a pandas DataFrame.

Example: How to Apply Function to Multiple Columns in Pandas

Suppose we have the following pandas DataFrame that contains information about various basketball players:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C'],
                   'points': [12, 15, 29, 22, 30, 41, 12],
                   'assists': [8, 10, 11, 11, 7, 14, 18]})

#view DataFrame
print(df)

  team  points  assists
0    A      12        8
1    A      15       10
2    B      29       11
3    B      22       11
4    C      30        7
5    C      41       14
6    C      12       18

Suppose that we would like to define a function that performs the following operation:

  • First, multiply the corresponding values in the points and assists columns.
  • Then, divide the result by 3.

We can use the following syntax to define this function:

def f(first, second):
    return (first*second) / 3

Note that we used the def statement to define the function and name it f, then we used the return statement inside the function to return a specific result.
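Before applying the function to a DataFrame, it can help to call it directly with a pair of plain numbers. The sketch below uses the values from the first two rows of the DataFrame:

```python
def f(first, second):
    return (first*second) / 3

# call the function with single values
print(f(12, 8))   # 32.0
print(f(15, 10))  # 50.0
```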

Now suppose that we would like to apply this function to the points and assists columns in the DataFrame.

We can use the following syntax to do so:

#apply function to multiple columns in DataFrame
df['new_col'] = df.apply(lambda x: f(x.points, x.assists), axis=1)

#view updated DataFrame
print(df)

  team  points  assists     new_col
0    A      12        8   32.000000
1    A      15       10   50.000000
2    B      29       11  106.333333
3    B      22       11   80.666667
4    C      30        7   70.000000
5    C      41       14  191.333333
6    C      12       18   72.000000

The new_col column contains the results of the function that we applied to the points and assists columns of the DataFrame.

For example, we can see:

  • The first value is (12 * 8) / 3 = 32.
  • The second value is (15 * 10) / 3 = 50.
  • The third value is (29 * 11) / 3 = 106.33.
  • The fourth value is (22 * 11) / 3 = 80.67.

And so on.
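Because this particular function uses only multiplication and division, it can also be called on the two columns directly, without apply. The sketch below is not part of the original example; it checks that both approaches give the same values for this function. Functions that contain if statements or other logic that only works on single values would generally still need the row-wise approach.

```python
import numpy as np
import pandas as pd

def f(first, second):
    return (first*second) / 3

df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C'],
                   'points': [12, 15, 29, 22, 30, 41, 12],
                   'assists': [8, 10, 11, 11, 7, 14, 18]})

# row-wise approach from the example
row_wise = df.apply(lambda x: f(x.points, x.assists), axis=1)

# pass the whole columns to the function at once
column_wise = f(df['points'], df['assists'])

# check that both approaches produce the same values
print(np.allclose(row_wise, column_wise))  # True
```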

Note that we specified that two arguments could be provided to our function named f. In this particular example, we chose to use the columns in the pandas DataFrame named points and assists as arguments in our custom function.

Feel free to create a function as complex as you would like and then use the apply and lambda functions to apply your function to multiple columns in the pandas DataFrame.
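As an illustration of a slightly more complex function, the sketch below uses a hypothetical function named label_player (with made-up thresholds, not part of the original example) that takes three columns and uses conditional logic:

```python
import pandas as pd

# hypothetical function that uses three columns and conditional logic
def label_player(team, points, assists):
    if team == 'C' and points > 25:
        return 'high scorer'
    if assists >= 10:
        return 'playmaker'
    return 'other'

df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C'],
                   'points': [12, 15, 29, 22, 30, 41, 12],
                   'assists': [8, 10, 11, 11, 7, 14, 18]})

# apply the function to three columns of each row
df['label'] = df.apply(lambda x: label_player(x.team, x.points, x.assists), axis=1)

print(df)
```

The lambda function passes three values from each row to label_player, and the returned labels are stored in a new column named label.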

Note: You can find the complete documentation for the apply function in the official pandas documentation.

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

How to Apply Function to Pandas Groupby
How to Perform a GroupBy Sum in Pandas
How to Use Groupby and Plot in Pandas
