# How to Create a Population Pyramid in R

This tutorial explains how to easily create a population pyramid in R.

## What is a Population Pyramid?

A population pyramid is a graph that shows the age and gender distribution of a given population. It is a useful chart for easily understanding the make-up of a population as well as the current trend in population growth.

If a population pyramid has a rectangular shape, it’s an indication that a population is growing at a slower rate; older generations are being replaced by new generations of roughly the same size.

If a population pyramid has a pyramid shape, it’s an indication that a population is growing at a faster rate; older generations are producing larger new generations.

Within the chart, the two genders are shown on the left and right sides, age is shown on the y-axis, and the percentage or count of the population is shown on the x-axis.

Let’s walk through an example of how to create a population pyramid in R.

## Creating a Population Pyramid in R

Suppose we have the following dataset that shows the percentage make-up of a population by age (1 to 100 years) and gender (M = “Male”, F = “Female”):

```r
#make this example reproducible
set.seed(1)

#create data frame
data <- data.frame(age = rep(1:100, 2), gender = rep(c("M", "F"), each = 100))

data$population <- 1/sqrt(data$age) * runif(200, 10000, 15000)

#convert population variable to percentage
data$population <- data$population / sum(data$population) * 100

#view first six rows of dataset
head(data)

#  age gender population
#1   1      M   2.424362
#2   2      M   1.794957
#3   3      M   1.589594
#4   4      M   1.556063
#5   5      M   1.053662
#6   6      M   1.266231

#view last six rows of dataset
tail(data)

#    age gender population
#195  95      F  0.2506803
#196  96      F  0.2829385
#197  97      F  0.2292992
#198  98      F  0.3070539
#199  99      F  0.2492992
#200 100      F  0.2977980
```
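
As a quick sanity check on the data above, the sketch below rebuilds the same data frame and confirms that the percentages sum to 100 across both genders. The per-gender totals computed with tapply() are an addition for illustration and are not part of the original example.

```r
#rebuild the example data frame
set.seed(1)
data <- data.frame(age = rep(1:100, 2), gender = rep(c("M", "F"), each = 100))
data$population <- 1/sqrt(data$age) * runif(200, 10000, 15000)
data$population <- data$population / sum(data$population) * 100

#percentages across all 200 rows should sum to 100
total <- sum(data$population)
print(total)

#share of the total population held by each gender
by_gender <- tapply(data$population, data$gender, sum)
print(by_gender)
```

If the total is not 100, the conversion step was probably run twice or on a modified data frame.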

We can create a basic population pyramid for this dataset using the ggplot2 package:

```r
#load ggplot2
library(ggplot2)

#create population pyramid
ggplot(data, aes(x = age, fill = gender,
                 y = ifelse(test = gender == "M",
                            yes = -population, no = population))) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = abs, limits = max(data$population) * c(-1, 1)) +
  coord_flip()
```
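
The ifelse() call inside aes() is what produces the mirrored layout: male values are negated so their bars extend to the left of zero, and labels = abs hides the minus signs on the axis. One alternative, sketched here on the assumption that you would rather keep the aesthetics simple, is to store the signed values in a helper column and use geom_col(), which is equivalent to geom_bar(stat = "identity"). The column name signed_pop is made up for this example.

```r
library(ggplot2)

#rebuild the example data frame
set.seed(1)
data <- data.frame(age = rep(1:100, 2), gender = rep(c("M", "F"), each = 100))
data$population <- 1/sqrt(data$age) * runif(200, 10000, 15000)
data$population <- data$population / sum(data$population) * 100

#helper column (hypothetical name): negative for males, positive for females
data$signed_pop <- ifelse(data$gender == "M", -data$population, data$population)

#same pyramid, with the sign handled before plotting
p <- ggplot(data, aes(x = age, y = signed_pop, fill = gender)) +
  geom_col() +
  scale_y_continuous(labels = abs, limits = max(data$population) * c(-1, 1)) +
  coord_flip()
p
```

Keeping the signed values in the data frame also makes it easy to inspect them directly if a bar ends up on the wrong side.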

## Modifying the Aesthetics of a Population Pyramid in R

We can also modify the aesthetics of the plot to add titles, axis labels, axis ticks, colors, and more.

### Adding Titles & Labels

We can add both a title and axis labels to the population pyramid using the labs() function:

```r
ggplot(data, aes(x = age, fill = gender,
                 y = ifelse(test = gender == "M",
                            yes = -population, no = population))) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = abs, limits = max(data$population) * c(-1, 1)) +
  labs(title = "Population Pyramid", x = "Age", y = "Percent of population") +
  coord_flip()
```
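
Because the axis shows percentages, it may be clearer to print a percent sign on the tick labels. The sketch below passes a small helper to the labels argument of scale_y_continuous(), which accepts any function of the break values. The helper name abs_percent is invented for this example.

```r
library(ggplot2)

#rebuild the example data frame
set.seed(1)
data <- data.frame(age = rep(1:100, 2), gender = rep(c("M", "F"), each = 100))
data$population <- 1/sqrt(data$age) * runif(200, 10000, 15000)
data$population <- data$population / sum(data$population) * 100

#hypothetical helper: drop the sign and append a percent sign
abs_percent <- function(x) paste0(abs(x), "%")

p <- ggplot(data, aes(x = age, fill = gender,
                      y = ifelse(test = gender == "M",
                                 yes = -population, no = population))) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = abs_percent,
                     limits = max(data$population) * c(-1, 1)) +
  labs(title = "Population Pyramid", x = "Age", y = "Percent of population") +
  coord_flip()
p
```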

### Modifying the Colors

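We can modify the two colors used to represent the genders by using the scale_colour_manual() function. Because the bars are colored through the fill aesthetic, we pass aesthetics = c("colour", "fill") so that the manual values apply to fill as well: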

```r
ggplot(data, aes(x = age, fill = gender,
                 y = ifelse(test = gender == "M",
                            yes = -population, no = population))) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = abs, limits = max(data$population) * c(-1, 1)) +
  labs(title = "Population Pyramid", x = "Age", y = "Percent of population") +
  scale_colour_manual(values = c("pink", "steelblue"),
                      aesthetics = c("colour", "fill")) +
  coord_flip()
```
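
In the code above, the colors are matched to the gender values in alphabetical order, so "pink" goes to F and "steelblue" goes to M. A minimal sketch of a more explicit alternative is to pass a named vector to scale_fill_manual(), so the mapping does not depend on that ordering. Using scale_fill_manual() instead of scale_colour_manual() is a substitution made for this example; since only fill is mapped, either approach should give the same plot. The name gender_colors is hypothetical.

```r
library(ggplot2)

#rebuild the example data frame
set.seed(1)
data <- data.frame(age = rep(1:100, 2), gender = rep(c("M", "F"), each = 100))
data$population <- 1/sqrt(data$age) * runif(200, 10000, 15000)
data$population <- data$population / sum(data$population) * 100

#named vector (hypothetical name): each gender value is tied to a color
gender_colors <- c(F = "pink", M = "steelblue")

p <- ggplot(data, aes(x = age, fill = gender,
                      y = ifelse(test = gender == "M",
                                 yes = -population, no = population))) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = abs, limits = max(data$population) * c(-1, 1)) +
  labs(title = "Population Pyramid", x = "Age", y = "Percent of population") +
  scale_fill_manual(values = gender_colors) +
  coord_flip()
p
```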

## Multiple Population Pyramids

It’s also possible to plot several population pyramids together using the facet_wrap() function. For example, suppose we have demographic data for countries A, B, and C. The following code illustrates how to create one population pyramid for each country:

```r
#make this example reproducible
set.seed(1)

#create data frame
data_multiple <- data.frame(age = rep(1:100, 6),
                            gender = rep(c("M", "F"), each = 300),
                            country = rep(c("A", "B", "C"), each = 100, times = 2))

#note: runif() returns 200 values here, which R recycles across the 600 rows
data_multiple$population <- round(1/sqrt(data_multiple$age) * runif(200, 10000, 15000), 0)

#view first six rows of dataset
head(data_multiple)

#  age gender country population
#1   1      M       A      11328
#2   2      M       A       8387
#3   3      M       A       7427
#4   4      M       A       7271
#5   5      M       A       4923
#6   6      M       A       5916

#view last six rows of dataset
tail(data_multiple)

#    age gender country population
#595  95      F       C       1171
#596  96      F       C       1322
#597  97      F       C       1071
#598  98      F       C       1435
#599  99      F       C       1165
#600 100      F       C       1391

#create one population pyramid per country
ggplot(data_multiple, aes(x = age, fill = gender,
                          y = ifelse(test = gender == "M",
                                     yes = -population, no = population))) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = abs, limits = max(data_multiple$population) * c(-1, 1)) +
  labs(y = "Population Amount") +
  coord_flip() +
  facet_wrap(~ country) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) #rotate x-axis labels
```
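
The nested rep() calls used to build this data frame are easy to get wrong, so before faceting it can be worth confirming that every country and gender combination covers each age exactly once. The check below is a sketch using base R only; the names counts and ages_ok are chosen for this example.

```r
#rebuild the identifying columns of the multi-country data frame
data_multiple <- data.frame(age = rep(1:100, 6),
                            gender = rep(c("M", "F"), each = 300),
                            country = rep(c("A", "B", "C"), each = 100, times = 2))

#count rows for every country/gender combination
counts <- table(data_multiple$country, data_multiple$gender)
print(counts)

#check that each combination contains ages 1 to 100 exactly once
ages_ok <- tapply(data_multiple$age,
                  list(data_multiple$country, data_multiple$gender),
                  function(a) identical(sort(a), 1:100))
print(ages_ok)
```

Each cell of counts should be 100 and each cell of ages_ok should be TRUE; anything else means a facet would be missing ages or double-counting them.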

## Modifying the Theme

Lastly, we can modify the theme of the charts. For example, the following code uses theme_classic() to give the charts a more minimalist look:

```r
ggplot(data_multiple, aes(x = age, fill = gender,
                          y = ifelse(test = gender == "M",
                                     yes = -population, no = population))) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = abs, limits = max(data_multiple$population) * c(-1, 1)) +
  labs(y = "Population Amount") +
  coord_flip() +
  facet_wrap(~ country) +
  theme_classic() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
```

You can also use themes from the ggthemes package. For a complete list of the available themes, check out the ggthemes documentation page.
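
As a rough sketch of how a ggthemes theme could be applied, the code below adds theme_few() to the faceted plot. It assumes the ggthemes package is installed; the requireNamespace() guard is there so the example still produces a plot with the default theme if it is not. theme_few() is just one of the package's themes, picked here for illustration.

```r
library(ggplot2)

#rebuild the multi-country data frame
set.seed(1)
data_multiple <- data.frame(age = rep(1:100, 6),
                            gender = rep(c("M", "F"), each = 300),
                            country = rep(c("A", "B", "C"), each = 100, times = 2))
data_multiple$population <- round(1/sqrt(data_multiple$age) * runif(200, 10000, 15000), 0)

#base faceted pyramid without any theme changes
p <- ggplot(data_multiple, aes(x = age, fill = gender,
                               y = ifelse(test = gender == "M",
                                          yes = -population, no = population))) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = abs, limits = max(data_multiple$population) * c(-1, 1)) +
  labs(y = "Population Amount") +
  coord_flip() +
  facet_wrap(~ country)

#apply a ggthemes theme only if the package is available (assumption: installed)
if (requireNamespace("ggthemes", quietly = TRUE)) {
  p <- p + ggthemes::theme_few()
}

#rotate x-axis labels after the complete theme so the setting is not overwritten
p <- p + theme(axis.text.x = element_text(angle = 90, hjust = 1))
p
```

The theme() tweak is added last because a complete theme such as theme_few() or theme_classic() replaces any earlier theme settings.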