How to Calculate Relative Frequency in Python


Relative frequency measures how often a given value occurs in a dataset relative to the total number of values in that dataset. It is calculated as the number of times the value occurs divided by the total number of values.

You can use the following function in Python to calculate relative frequencies:

def rel_freq(x):
    #pair each distinct value in x with its count divided by the total number of values
    freqs = [(value, x.count(value) / len(x)) for value in set(x)]
    return freqs
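
Because rel_freq calls x.count() once per distinct value, it rescans the whole list each time, which can become slow for long lists with many distinct values. The following is a minimal sketch of a single-pass alternative built on collections.Counter from the standard library. The name rel_freq_counter is hypothetical and used only for illustration, and the sketch assumes x is a list or another sequence that supports len().

```python
from collections import Counter

def rel_freq_counter(x):
    # hypothetical single-pass variant of rel_freq (name chosen for illustration)
    counts = Counter(x)   # count every distinct value in one pass
    n = len(x)            # total number of values
    # pair each distinct value with its share of the total
    return [(value, count / n) for value, count in counts.items()]

result = rel_freq_counter([1, 1, 1, 2, 3, 4, 4])
print(result)
```

Unlike the set-based version, this variant returns the values in the order in which they first appear in the input. The examples that follow use rel_freq as defined above.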

The following examples show how to use this function in practice.

Example 1: Relative Frequencies for a List of Numbers

The following code shows how to use this function to calculate relative frequencies for a list of numbers:

#define data
data = [1, 1, 1, 2, 3, 4, 4]

#calculate relative frequencies for each value in list
rel_freq(data)

[(1, 0.42857142857142855),
 (2, 0.14285714285714285),
 (3, 0.14285714285714285),
 (4, 0.2857142857142857)]

The way to interpret this output is as follows:

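  • The value “1” occurs 3 times out of 7, so it has a relative frequency of about 0.4286 in the dataset.
  • The value “2” occurs 1 time out of 7, so it has a relative frequency of about 0.1429 in the dataset.
  • The value “3” occurs 1 time out of 7, so it has a relative frequency of about 0.1429 in the dataset.
  • The value “4” occurs 2 times out of 7, so it has a relative frequency of about 0.2857 in the dataset.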

You’ll notice that all of the relative frequencies add up to 1.
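
The claim that the relative frequencies add up to 1 can be checked in code. The sketch below repeats the rel_freq definition so that it runs on its own. Since floating-point sums can be off by a tiny rounding error, it compares with math.isclose rather than ==, and it also repeats the calculation with fractions.Fraction to show that the sum is exactly 1. The variable names total and exact_total are assumptions made for this illustration.

```python
import math
from fractions import Fraction

def rel_freq(x):
    freqs = [(value, x.count(value) / len(x)) for value in set(x)]
    return freqs

data = [1, 1, 1, 2, 3, 4, 4]

# sum the floating-point relative frequencies
total = sum(freq for _, freq in rel_freq(data))
print(math.isclose(total, 1.0))

# repeat the calculation with exact fractions (3/7 + 1/7 + 1/7 + 2/7)
exact_total = sum(Fraction(data.count(value), len(data)) for value in set(data))
print(exact_total == 1)
```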

Example 2: Relative Frequencies for a List of Characters

The following code shows how to use this function to calculate relative frequencies for a list of characters:

#define data
data = ['a', 'a', 'b', 'b', 'c']

#calculate relative frequencies for each value in list
rel_freq(data)

[('a', 0.4), ('b', 0.4), ('c', 0.2)]

The way to interpret this output is as follows:

  • The value “a” has a relative frequency of 0.4 in the dataset.
  • The value “b” has a relative frequency of 0.4 in the dataset.
  • The value “c” has a relative frequency of 0.2 in the dataset.

Once again, all of the relative frequencies add up to 1.

Example 3: Relative Frequencies for a Column in a pandas DataFrame

The following code shows how to use this function to calculate relative frequencies for a specific column in a pandas DataFrame:

import pandas as pd

#define data
data = pd.DataFrame({'A': [25, 15, 15, 14, 19],
                     'B': [5, 7, 7, 9, 12],
                     'C': [11, 8, 10, 6, 6]})

#calculate relative frequencies of values in column 'A'
rel_freq(list(data['A']))

[(25, 0.2), (19, 0.2), (14, 0.2), (15, 0.4)]

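The way to interpret this output is as follows (the pairs appear in the iteration order of the set of distinct values, so their order can differ from the order of the values in the column):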

  • The value “25” has a relative frequency of 0.2 in the column.
  • The value “19” has a relative frequency of 0.2 in the column.
  • The value “14” has a relative frequency of 0.2 in the column.
  • The value “15” has a relative frequency of 0.4 in the column.

Once again, all of the relative frequencies add up to 1.
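
When the data is already in a pandas DataFrame, the same result can be obtained without converting the column to a list. The sketch below uses the pandas Series method value_counts() with normalize=True, which returns proportions instead of raw counts. The variable names rel and rel_dict are assumptions made for this illustration.

```python
import pandas as pd

# same DataFrame as in the example above
data = pd.DataFrame({'A': [25, 15, 15, 14, 19],
                     'B': [5, 7, 7, 9, 12],
                     'C': [11, 8, 10, 6, 6]})

# normalize=True divides each count by the total number of values
rel = data['A'].value_counts(normalize=True)
print(rel)

# convert to a plain dictionary for easy lookups by value
rel_dict = rel.to_dict()
```

The result is a Series indexed by the distinct values and sorted from the most frequent to the least frequent, so the value 15 (relative frequency 0.4) is listed first.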

