Frequency Tables in R

This tutorial explains how to create and work with frequency tables in R.

Frequency Tables: The Basics

A frequency table is a table that shows how many times certain values occur in a data set.

For example, suppose we have the following dataset:

Example of a frequency table

To make a frequency table of the variable Favorite Color, we simply count how many times each color shows up:

Frequency table from raw data

Frequency Tables in R

We can easily create a frequency table for any variable in a dataset in R using the built-in table() function.
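
As a minimal sketch of the idea, the block below applies table() to a small made-up vector of favorite colors. The values are illustrative assumptions, not data from this tutorial, and the names favorite_color and color_counts are arbitrary.

```r
# Hypothetical raw data: one favorite color per person (illustrative values only)
favorite_color <- c("red", "blue", "blue", "green", "red", "blue")

# table() counts how many times each distinct value occurs
color_counts <- table(favorite_color)
color_counts
```

The result has one entry per distinct value, labeled with that value.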

For all of the following examples, we will use the built-in R dataset mtcars, which contains data about the features of 32 different cars:

#view first six rows of mtcars
head(mtcars)

#                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
#Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

One-Way Frequency Table

We can create a one-way frequency table that shows how often each value of gear shows up in the dataset:

table(mtcars$gear)

# 3  4  5 
#15 12  5 

This frequency table tells us that three distinct values of gear appear in the dataset: 3, 4, and 5. The value 3 occurs 15 times, 4 occurs 12 times, and 5 occurs 5 times.
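
A table object can be handled much like a named vector. The sketch below, which uses only base R, sorts the counts, looks up a single count by name, and converts the table to a data frame. The names gear_counts and gear_df and the column labels are arbitrary choices for illustration.

```r
gear_counts <- table(mtcars$gear)

# sort the counts from most to least frequent
sort(gear_counts, decreasing = TRUE)

# look up the count for a single value by its name
gear_counts["4"]

# convert the table to a data frame with one row per value
gear_df <- as.data.frame(gear_counts)
names(gear_df) <- c("gear", "count")
gear_df
```

Converting to a data frame can be convenient when the counts need to be merged with other data or passed to a plotting function that expects columns.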

Two-Way Frequency Table

We can also create a two-way frequency table, which shows how often each combination of values of two variables occurs in the dataset. For example, we can create a two-way frequency table for am and gear:

#with() lets us refer to the columns by name without attaching the dataset
with(mtcars, table(am, gear))

#   gear
#am   3  4  5
#  0 15  4  0
#  1  0  8  5

From the table we can see that the variable am takes two values in the dataset (0 and 1) and the variable gear takes three values (3, 4, and 5). The way to interpret the values in the table is as follows:

  • There are 15 cars in the dataset that have a value of 3 for gear and a value of 0 for am.
  • There are 4 cars in the dataset that have a value of 4 for gear and a value of 0 for am.
  • And so on…

Notice that the counts in the table sum to 32, which matches the total number of rows in the dataset.
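
Both observations can be checked directly in R. The following is a small sketch using base R only; the name am_gear is an arbitrary choice.

```r
am_gear <- with(mtcars, table(am, gear))

# look up a single cell by its row (am) and column (gear) labels
am_gear["0", "3"]

# the cell counts sum to the number of rows in the data
sum(am_gear) == nrow(mtcars)
```

The labels are given as character strings because the table stores its row and column values as names.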

Three-Way Frequency Table

We can also create a three-way frequency table by passing three variables to table() and then using the ftable() function to print the result in a compact, flattened layout. For example, we can create a three-way frequency table for the variables vs, gear, and am:

three_way <- with(mtcars, table(vs, gear, am))
ftable(three_way)

#        am  0  1
#vs gear         
#0  3       12  0
#   4        0  2
#   5        0  4
#1  3        3  0
#   4        4  6
#   5        0  1

The way to interpret the values in the table is as follows:

  • There are 12 cars in the dataset that have a value of 0 for am, a value of 0 for vs, and a value of 3 for gear.

Notice that the total values in the table once again add up to 32, which matches the total number of rows in the dataset.
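
A single cell of the three-way table can also be read off by its labels, and ftable() can be asked to arrange the variables differently. The sketch below assumes the table was built with named variables (here through with()), so that the variables can be referred to by name in row.vars.

```r
three_way <- with(mtcars, table(vs, gear, am))

# index by labels in the order the variables were given: vs, gear, am
three_way["0", "3", "0"]

# choose which variables form the rows of the flattened table
ftable(three_way, row.vars = c("am", "vs"))
```

Changing row.vars only changes the layout; the underlying counts are the same.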

Tables of Proportions

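We can use the built-in R function prop.table() to create tables that show proportions. The second argument selects the margin: 1 gives proportions within each row, and 2 gives proportions within each column.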

mtcars_table <- table(mtcars$am, mtcars$gear)
mtcars_table

#     3  4  5
#  0 15  4  0
#  1  0  8  5

#create table of row proportions (each row sums to 1)
prop.table(mtcars_table, 1)

#            3         4         5
#  0 0.7894737 0.2105263 0.0000000
#  1 0.0000000 0.6153846 0.3846154

#create table of column proportions (each column sums to 1)
prop.table(mtcars_table, 2)

#            3         4         5
#  0 1.0000000 0.3333333 0.0000000
#  1 0.0000000 0.6666667 1.0000000
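
When the margin argument is left out, prop.table() divides every cell by the grand total, so the whole table sums to 1. The sketch below shows this along with one possible way to present row proportions as rounded percentages; the name overall and the rounding choices are illustrative.

```r
mtcars_table <- table(mtcars$am, mtcars$gear)

# proportions of the grand total (all cells together sum to 1)
overall <- prop.table(mtcars_table)
round(overall, 3)

# row proportions expressed as percentages, rounded to one decimal place
round(100 * prop.table(mtcars_table, 1), 1)
```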

Tables of Marginal Frequencies

We can use the built-in R function margin.table() to create tables that show marginal frequencies, that is, the row totals or column totals of a table.

mtcars_table <- table(mtcars$am, mtcars$gear)
mtcars_table

#     3  4  5
#  0 15  4  0
#  1  0  8  5

#create table that shows marginal frequencies of rows
margin.table(mtcars_table, 1)

# 0  1 
#19 13 

#create table that shows marginal frequencies of columns
margin.table(mtcars_table, 2)

# 3  4  5 
#15 12  5 
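
If the totals should appear alongside the counts rather than in a separate table, base R's addmargins() function is one option. The sketch below appends a row and a column labeled Sum; the name with_totals is arbitrary.

```r
mtcars_table <- table(mtcars$am, mtcars$gear)

# append a "Sum" row and a "Sum" column to the table
with_totals <- addmargins(mtcars_table)
with_totals
```

The bottom-right cell holds the grand total, which again equals the number of rows in mtcars.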

Chi-Square Test of Independence

We can conduct a chi-square test of independence to test for the independence of the row and column variables by using the chisq.test() function:

mtcars_table <- table(mtcars$am, mtcars$gear)

chisq.test(mtcars_table)

#	Pearson's Chi-squared test
#
#data:  mtcars_table
#X-squared = 20.945, df = 2, p-value = 2.831e-05
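
Because several cells in this table are small, R may warn that the chi-squared approximation could be inaccurate. One way to check is to inspect the expected counts stored in the test result. The sketch below assumes a common rule of thumb (a convention, not something stated in this tutorial) that expected counts should be at least 5; the name chi_result is arbitrary.

```r
mtcars_table <- table(mtcars$am, mtcars$gear)

# store the result; suppressWarnings() hides the small-sample warning, if any
chi_result <- suppressWarnings(chisq.test(mtcars_table))

# expected counts under independence: (row total * column total) / grand total
chi_result$expected

# individual pieces of the result
chi_result$statistic
chi_result$parameter
chi_result$p.value
```

If many expected counts fall below the chosen threshold, an exact test may be a safer choice.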

Fisher’s Exact Test

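We can also conduct Fisher’s Exact Test using the fisher.test() function. This test is commonly used in place of a chi-square test when sample sizes are small. It is most often applied to 2×2 tables, but fisher.test() also accepts larger tables such as the 2×3 table used here.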

mtcars_table <- table(mtcars$am, mtcars$gear)

fisher.test(mtcars_table)

#	Fisher's Exact Test for Count Data
#
#data:  mtcars_table
#p-value = 2.13e-06
#alternative hypothesis: two.sided
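
For comparison, here is a sketch of the test on a 2×2 table, built from am (transmission type) and vs (engine shape) in mtcars. This pairing is chosen only for illustration. For 2×2 tables the result additionally carries an odds ratio estimate and a confidence interval.

```r
# 2x2 table of transmission type (am) by engine shape (vs)
am_vs <- table(mtcars$am, mtcars$vs)
am_vs

fisher_result <- fisher.test(am_vs)

# the p-value is available for tables of any size
fisher_result$p.value

# for 2x2 tables the result also includes an odds ratio and its confidence interval
fisher_result$estimate
fisher_result$conf.int
```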

Visualizing Frequency Tables

We can visualize a one-way frequency table using the barplot() function:

gears_table <- table(mtcars$gear)

barplot(gears_table,
        main = 'Gear Frequency',
        xlab = '# Gears',
        ylab = 'Frequency')

Barplot in R

We can visualize a two-way frequency table using the mosaicplot() function:

mtcars_table <- table(mtcars$am, mtcars$gear) 

mosaicplot(mtcars_table)

Mosaic plot for two-way table in R

We can also flip the columns and rows of the mosaic plot using the sort argument of mosaicplot(), which takes a vector giving the order of the variables:

mtcars_table <- table(mtcars$am, mtcars$gear) 

mosaicplot(mtcars_table, sort=c(2, 1))

Flipped mosaic plot in R

We can also add some style to the mosaic plot to make it more aesthetically pleasing:

mtcars_table <- table(mtcars$am, mtcars$gear) 

mosaicplot(mtcars_table,
           main = 'Frequencies of "am" and "gear"',
           xlab = 'am',
           ylab = 'gear',
           col = 'steelblue')

Mosaic plot with custom style in R
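
A grouped bar chart is another possible way to display a two-way table. The sketch below passes the same table to barplot() with beside = TRUE; the title, axis labels, and legend text are illustrative choices.

```r
mtcars_table <- table(mtcars$am, mtcars$gear)

# one group of bars per gear value, one bar per am value within each group
barplot(mtcars_table,
        beside = TRUE,
        main = 'Frequencies of "am" and "gear"',
        xlab = 'gear',
        ylab = 'Frequency',
        legend.text = c('am = 0', 'am = 1'))
```

Here the columns of the table (gear) define the groups and the rows (am) define the bars within each group.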
