# How to Count Observations by Group in Pandas

Often you may be interested in counting the number of observations by group in a pandas DataFrame.

Fortunately this is easy to do using the groupby() and size() functions with the following syntax:

```df.groupby('column_name').size()
```

This tutorial explains several examples of how to use this function in practice using the following data frame:

```import numpy as np
import pandas as pd

#create pandas DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'B', 'C', 'C'],
'division':['E', 'W', 'E', 'E', 'W', 'W', 'E'],
'rebounds': [11, 8, 7, 6, 6, 5, 12]})

#display DataFrame
print(df)

team division  rebounds
0    A        E        11
1    A        W         8
2    B        E         7
3    B        E         6
4    B        W         6
5    C        W         5
6    C        E        12```

### Example 1: Count by One Variable

The following code shows how to count the total number of observations by team:

```#count total observations by variable 'team'
df.groupby('team').size()

team
A    2
B    3
C    2
dtype: int64
```

From the output we can see that:

• Team A has 2 observations
• Team B has 3 observations
• Team C has 2 observations

Note that the previous code produces a Series. In most cases we want to work with a DataFrame, so we can use the reset_index() function to produce a DataFrame instead:

```df.groupby('team').size().reset_index(name='obs')

team	obs
0	A	2
1	B	3
2	C	2
```

### Example 2: Count and Sort by One Variable

We can also use the sort_values() function to sort the group counts.

We can specify ascending=False to sort group counts from largest to smallest or ascending=True to sort from smallest to largest:

```df.groupby('team').size().reset_index(name='obs').sort_values(['obs'], ascending=True)

team	obs
0	A	2
2	C	2
1	B	3```

### Example 3: Count by Multiple Variables

We can also count the number of observations grouped by multiple variables in a pandas DataFrame:

```#count observations grouped by team and division
df.groupby(['team', 'division']).size().reset_index(name='obs')

team	division  obs
0	A	E	  1
1	A	W	  1
2	B	E	  2
3	B	W	  1
4	C	E	  1
5	C	W	  1
```

From the output we can see that:

• 1 observation belongs to Team A and division E
• 1 observation belongs to Team A and division W
• 2 observations belongs to Team B and division E
• 1 observation belongs to Team B and division W
• 1 observation belongs to Team C and division E
• 1 observation belongs to Team C and division W