You can use the following methods to group DataFrame rows into a list using GroupBy in pandas:
Method 1: Group Rows into List for One Column
df.groupby('group_var')['values_var'].agg(list).reset_index(name='values_var')
Method 2: Group Rows into List for Multiple Columns
df.groupby('group_var').agg(list)
The following examples show how to use each method in practice with the following pandas DataFrame:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
                   'points': [10, 10, 12, 15, 19, 23, 20, 20, 26],
                   'assists': [6, 8, 9, 11, 13, 8, 8, 15, 10]})

#view DataFrame
print(df)

  team  points  assists
0    A      10        6
1    A      10        8
2    A      12        9
3    A      15       11
4    B      19       13
5    B      23        8
6    C      20        8
7    C      20       15
8    C      26       10
Example 1: Group Rows into List for One Column
We can use the following syntax to group rows by the team column and produce a list of the values in the points column for each team:
#group points values into list by team
df.groupby('team')['points'].agg(list).reset_index(name='points')
  team            points
0    A  [10, 10, 12, 15]
1    B          [19, 23]
2    C      [20, 20, 26]
We can see that a list of points values is produced for each unique team in the DataFrame.
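As a quick check on the result above, the following sketch rebuilds the example DataFrame and inspects the grouped output. The names points_by_team and n_values are illustrative assumptions, not part of the original example.

```python
import pandas as pd

# same example DataFrame as above
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
                   'points': [10, 10, 12, 15, 19, 23, 20, 20, 26],
                   'assists': [6, 8, 9, 11, 13, 8, 8, 15, 10]})

# group points values into list by team (points_by_team is an illustrative name)
points_by_team = df.groupby('team')['points'].agg(list).reset_index(name='points')

# each cell in the points column holds an ordinary Python list
first_cell = points_by_team.loc[0, 'points']
print(type(first_cell))

# count how many values were collected for each team (n_values is an illustrative name)
points_by_team['n_values'] = points_by_team['points'].map(len)
print(points_by_team)
```

Because each cell is a plain list, the usual list operations such as len() or indexing can be applied to it row by row.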
Example 2: Group Rows into List for Multiple Columns
We can use the following syntax to group rows by the team column and produce a list of values for both the points and assists columns:
#group points and assists values into lists by team
df.groupby('team').agg(list)
                points        assists
team
A     [10, 10, 12, 15]  [6, 8, 9, 11]
B             [19, 23]        [13, 8]
C         [20, 20, 26]    [8, 15, 10]
We can see that a list of points values and a list of assists values are produced for each unique team in the DataFrame.
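In the output above, team is the index of the result rather than a regular column. The sketch below shows one possible way to restrict the aggregation to chosen columns and move team back into a column with reset_index(). The name lists_by_team is an illustrative assumption.

```python
import pandas as pd

# same example DataFrame as above
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
                   'points': [10, 10, 12, 15, 19, 23, 20, 20, 26],
                   'assists': [6, 8, 9, 11, 13, 8, 8, 15, 10]})

# select only the columns to collect, then group their values into lists by team
# (lists_by_team is an illustrative name)
lists_by_team = df.groupby('team')[['points', 'assists']].agg(list)

# move team from the index back into a regular column
lists_by_team = lists_by_team.reset_index()
print(lists_by_team)
```

Selecting the columns explicitly can be useful when the DataFrame contains other columns that should not be collected into lists.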
Note: You can find the complete documentation for the GroupBy operation in the official pandas documentation.
Additional Resources
The following tutorials explain how to perform other common operations in pandas:
Pandas: How to Calculate Cumulative Sum by Group
Pandas: How to Count Unique Values by Group
Pandas: How to Calculate Mode by Group
Pandas: How to Calculate Correlation By Group