# How to Convert Categorical Variable to Numeric in Pandas

You can use the following basic syntax to convert a categorical variable to a numeric variable in a pandas DataFrame:

```python
#factorize returns (codes, uniques); [0] keeps only the integer codes
df['column_name'] = pd.factorize(df['column_name'])[0]
```

You can also use the following syntax to convert every categorical variable in a DataFrame to a numeric variable:

```python
#identify all categorical variables
cat_columns = df.select_dtypes(['object']).columns

#convert all categorical variables to numeric (keep only the codes)
df[cat_columns] = df[cat_columns].apply(lambda x: pd.factorize(x)[0])
```
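One detail worth keeping in mind is that `pd.factorize()` returns two objects rather than one: an array of integer codes and the unique values those codes stand for. The minimal sketch below uses a small made-up Series named `teams` (not the DataFrame from the examples) to show both pieces; only the first one belongs in a DataFrame column.

```python
import pandas as pd

# hypothetical example data, not taken from the DataFrame used in this tutorial
teams = pd.Series(['A', 'A', 'B', 'C', 'B'])

# factorize returns a tuple: (integer codes, unique values)
codes, uniques = pd.factorize(teams)

print(codes)          # one integer code per row, numbered in order of first appearance
print(list(uniques))  # the label that each code stands for
```

Assigning the whole tuple to a column would not give a column of codes, which is why the codes are selected explicitly.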

The following examples show how to use this syntax in practice.

### Example 1: Convert One Categorical Variable to Numeric

Suppose we have the following pandas DataFrame:

```python
import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
                   'position': ['G', 'G', 'F', 'G', 'F', 'C', 'G', 'F', 'C'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4, 13],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12, 10]})

#view DataFrame
df

  team position  points  rebounds
0    A        G       5        11
1    A        G       7         8
2    A        F       7        10
3    B        G       9         6
4    B        F      12         6
5    B        C       9         5
6    C        G       9         9
7    C        F       4        12
8    C        C      13        10
```

We can use the following syntax to convert the ‘team’ column to numeric:

```python
#convert 'team' column to numeric (factorize returns (codes, uniques); keep the codes)
df['team'] = pd.factorize(df['team'])[0]

#view updated DataFrame
df

   team position  points  rebounds
0     0        G       5        11
1     0        G       7         8
2     0        F       7        10
3     1        G       9         6
4     1        F      12         6
5     1        C       9         5
6     2        G       9         9
7     2        F       4        12
8     2        C      13        10
```

Here is how the conversion worked:

• Each team that had a value of ‘A’ was converted to 0.
• Each team that had a value of ‘B’ was converted to 1.
• Each team that had a value of ‘C’ was converted to 2.
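The mapping described above does not have to be read off by eye. As a rough sketch, assuming the labels are captured before the column is overwritten, the second value returned by `pd.factorize()` can be turned into a lookup dictionary. The names `labels`, `mapping`, and `original` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C']})

# keep both return values: the integer codes and the original labels
codes, labels = pd.factorize(df['team'])

# build a code -> label lookup
mapping = dict(enumerate(labels))
print(mapping)  # {0: 'A', 1: 'B', 2: 'C'}

# replace the column with its codes
df['team'] = codes

# the original labels can be recovered from the codes if needed
original = df['team'].map(mapping)
```

Keeping `mapping` around makes it possible to translate the numeric codes back into team names later, for example when labeling a plot or a report.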

### Example 2: Convert Multiple Categorical Variables to Numeric

Suppose we again have the same pandas DataFrame:

```python
import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
                   'position': ['G', 'G', 'F', 'G', 'F', 'C', 'G', 'F', 'C'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4, 13],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12, 10]})

#view DataFrame
df

  team position  points  rebounds
0    A        G       5        11
1    A        G       7         8
2    A        F       7        10
3    B        G       9         6
4    B        F      12         6
5    B        C       9         5
6    C        G       9         9
7    C        F       4        12
8    C        C      13        10
```

We can use the following syntax to convert every categorical variable in the DataFrame to a numeric variable:

```python
#get all categorical columns
cat_columns = df.select_dtypes(['object']).columns

#convert all categorical columns to numeric (keep only the codes from factorize)
df[cat_columns] = df[cat_columns].apply(lambda x: pd.factorize(x)[0])

#view updated DataFrame
df

   team  position  points  rebounds
0     0         0       5        11
1     0         0       7         8
2     0         1       7        10
3     1         0       9         6
4     1         1      12         6
5     1         2       9         5
6     2         0       9         9
7     2         1       4        12
8     2         2      13        10
```

Notice that the two categorical columns (team and position) both got converted to numeric while the points and rebounds columns remained the same.
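One caveat, offered as a side note rather than as part of the examples above: by default `pd.factorize()` does not assign a regular code to missing values and marks them with -1 instead. The small sketch below uses a made-up Series `s` to illustrate this.

```python
import pandas as pd

# hypothetical data with one missing value
s = pd.Series(['G', None, 'F', 'G'])

codes, uniques = pd.factorize(s)

print(codes)          # the missing entry is coded as -1
print(list(uniques))  # missing values are not listed among the unique labels
```

If -1 could be mistaken for a real category downstream, it may be worth filling or dropping missing values before factorizing.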

Note: The complete documentation for the pandas factorize() function is available in the official pandas documentation.