An Easy Guide to Writing Functions in R (With Examples)


R has many built-in functions along with several others that you can access by installing and loading various packages. This means whenever you use R, you’re already using functions whether you realize it or not. This guide will teach you how to actually write your own functions and explain why they can be helpful in both writing and reading code.

The Basic Syntax of a Function

Functions in R use the following syntax:

function.name <- function(arg1, arg2, arg3 = 10, ...) {
  variable1 = arg1 * arg2 #some calculations
  variable1 / arg3 #return some value
}

function.name – the name of the function. You can name a function almost anything you’d like, but you should avoid names that are already used elsewhere in R, such as plot or mean, because your definition would mask the existing function.

arg1, arg2, arg3 – arguments of the function. Your function can have any number of arguments you’d like. Note that you can specify default values for some arguments. For example, arg3 has a default value of 10 in the example above.

... – the ellipsis argument is an optional argument that allows additional arguments to be passed into the function and on to another function. This is often used for plotting.

Function body – the code within the brackets {} is run every time the function is called. In general, it’s a good idea to write functions that are fairly short and do one specific thing. Note that the lines within the function body are typically indented by two spaces to make the function easier to read.

Return value – the last line of code within the brackets {} provides the value that will be returned by the function. In the example above, the value of variable1 / arg3 will be returned by the function. Note that a function doesn’t have to return a meaningful value. For example, functions that generate plots are usually called for the plot they draw rather than for a returned value. Also note that you may write return(variable1 / arg3) explicitly in the function above, but it’s not required.
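
As a minimal sketch of default arguments and return values, the snippet below defines two hypothetical functions, scaled_product and scaled_product_explicit, modeled on the syntax template above. The names are made up for illustration. The first relies on the last line being returned, the second calls return() explicitly, and both use the default for arg3 unless the caller overrides it.

```r
#hypothetical function following the template above; arg3 defaults to 10
scaled_product <- function(arg1, arg2, arg3 = 10) {
  variable1 <- arg1 * arg2
  variable1 / arg3  #the value of the last expression is returned
}

#same calculation with an explicit return() call
scaled_product_explicit <- function(arg1, arg2, arg3 = 10) {
  variable1 <- arg1 * arg2
  return(variable1 / arg3)
}

#use the default value of arg3
scaled_product(4, 5)

#[1] 2

#override the default value of arg3
scaled_product(4, 5, arg3 = 2)

#[1] 10

scaled_product_explicit(4, 5)

#[1] 2
```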

The following example illustrates a function that takes three numbers as arguments and returns the product of the three numbers:

#define the function
multiply_three_numbers <- function(num1, num2, num3) {
  num1 * num2 * num3
}

#call the function
multiply_three_numbers(1, 2, 3) 

#[1] 6

The function above used the three numbers provided as arguments and multiplied them together to return 1 * 2 * 3 = 6.

Using the Correct Number of Arguments

Note that if the number of arguments provided to the function does not match the number of arguments defined in the function, an error will occur:

#call the function with too few arguments
multiply_three_numbers(1, 2) 

#Error in multiply_three_numbers(1, 2) : 
#  argument "num3" is missing, with no default

#call the function with too many arguments
multiply_three_numbers(1, 2, 3, 4) 

#Error in multiply_three_numbers(1, 2, 3, 4) : unused argument (4)
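
If you would rather handle such an error than have the script stop, one option is base R’s tryCatch(). The sketch below redefines multiply_three_numbers so that it is self-contained and stores the error message as a string. The variable name msg is just for illustration. It also shows that arguments can be matched by name, in which case their order doesn’t matter.

```r
multiply_three_numbers <- function(num1, num2, num3) {
  num1 * num2 * num3
}

#capture the error message instead of stopping the script
msg <- tryCatch(
  multiply_three_numbers(1, 2),
  error = function(e) conditionMessage(e)
)
msg

#arguments matched by name can be supplied in any order
multiply_three_numbers(num3 = 3, num1 = 1, num2 = 2)

#[1] 6
```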

Examples of Functions

The following examples show several different functions in action.

Function: Find Descriptive Statistics

The following function takes a vector as an argument and outputs a string that shows the mean, standard deviation, and number of elements in the vector:

#define function
descriptive_stats <- function(x) {
  mean = round(mean(x), 1)
  sd = round(sd(x), 1)
  length = length(x)
  paste("mean:", mean, "standard deviation:",  sd, "count:", length)
}

#run function
numbers <- c(12, 14, 15, 38, 37, 35, 24, 2, 6, 5)
descriptive_stats(numbers)

#[1] "mean: 18.8 standard deviation: 13.8 count: 10"

The following function performs the same calculations as the previous function, except it outputs the results as a vector:

#define function
descriptive_stats <- function(x) {
  mean = round(mean(x), 1)
  sd = round(sd(x), 1)
  length = length(x)
  c(mean, sd, length)
}

#run function
numbers <- c(12, 14, 15, 38, 37, 35, 24, 2, 6, 5)
descriptive_stats(numbers)

#[1] 18.8 13.8 10.0

#access the mean from the vector
descriptive_stats(numbers)[1]

#[1] 18.8 

#access the standard deviation from the vector
descriptive_stats(numbers)[2]

#[1] 13.8 

#access the length from the vector
descriptive_stats(numbers)[3]

#[1] 10
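
Positional indexes such as [2] are easy to mix up. As one possible variation, the sketch below returns a named vector so that each statistic can be looked up by name. The function name descriptive_stats_named and the element names mean, sd, and count are assumptions made for this example and are not part of the function above.

```r
#hypothetical variant that labels each element of the result
descriptive_stats_named <- function(x) {
  c(mean = round(mean(x), 1),
    sd = round(sd(x), 1),
    count = length(x))
}

numbers <- c(12, 14, 15, 38, 37, 35, 24, 2, 6, 5)
stats <- descriptive_stats_named(numbers)

#access the standard deviation by name
stats[["sd"]]

#[1] 13.8
```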

Function: Extract the Last Characters from a String

The following function takes a string and a number n as arguments and outputs the last n characters of the string:

#define function
last_n_characters <- function(string, n) {
  substr(string, nchar(string) - n + 1, nchar(string))
}

#use function to find last 4 characters of a string
x <- "I like to walk in the park" 
last_n_characters(x, 4)

#[1] "park"

#use function to find last 7 characters of a string
x <- "My favorite color is blue"
last_n_characters(x, 7)

#[1] "is blue"
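
Because substr() and nchar() both work on character vectors, the same function can process several strings in one call. The sketch below repeats the definition so that it runs on its own. The vector fruits is made-up sample data.

```r
last_n_characters <- function(string, n) {
  substr(string, nchar(string) - n + 1, nchar(string))
}

#apply the function to several strings at once
fruits <- c("apple", "banana", "cherry")
last_n_characters(fruits, 3)

#[1] "ple" "ana" "rry"

#asking for more characters than the string contains returns the whole string
last_n_characters("park", 10)

#[1] "park"
```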

Function: Find Z Scores

A z-score tells you how many standard deviations away from the mean a certain value lies. The following function calculates the z-scores for each value in a vector:

#define function
find_z_scores <- function(x) {
  (x - mean(x)) / sd(x)
}

#use function to find z-score of each value in vector x
x <- c(1, 14, 7, 5, 23, 12, 4, 5, 7, 4) 
find_z_scores(x)

#[1] -1.1115724  0.8954333 -0.1852621 -0.4940322  2.2848988  0.5866632
#[7] -0.6484172 -0.4940322 -0.1852621 -0.6484172
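
One way to sanity-check a function like this is to confirm that the resulting z-scores have a mean of roughly 0 and a standard deviation of 1. The sketch below does that and also compares the result with base R’s scale() function, which performs the same standardization by default. The variable name z is chosen just for this example.

```r
find_z_scores <- function(x) {
  (x - mean(x)) / sd(x)
}

x <- c(1, 14, 7, 5, 23, 12, 4, 5, 7, 4)
z <- find_z_scores(x)

#the mean of the z-scores is 0, up to floating-point rounding
abs(mean(z)) < 1e-12

#[1] TRUE

#the standard deviation of the z-scores is 1
all.equal(sd(z), 1)

#[1] TRUE

#scale() returns a matrix, so convert it to a plain vector before comparing
all.equal(z, as.vector(scale(x)))

#[1] TRUE
```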

Function: Create a Blue Plot with Pre-Defined Axis Labels

The following function creates a scatterplot with blue dots and pre-defined axis labels. By including the ... argument, we allow any additional arguments supplied by the caller to be passed along to the built-in plot() function.

#define function
blue_plot <- function(x, y, ...) {
  plot(x, y, col = 'blue', xlab = 'X-axis', ylab = 'Y-axis', ...)
}

#run function
blue_plot(1:10, 1:10, pch = 17, main = 'A plot of blue triangles')

[Figure: a plot generated by a function in R]

Notice that we didn’t include pch or main explicitly in the arguments of the function, but by using the ellipsis (...) argument, we were able to use pch and main anyway.
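
The same pass-through technique works with functions that have nothing to do with plotting. As a small sketch, the hypothetical wrapper rounded_mean below forwards any extra arguments to mean(), so a caller can supply na.rm = TRUE even though na.rm is not listed among the wrapper’s own arguments. The function name and the sample vector values are assumptions for this example.

```r
#hypothetical wrapper: forwards extra arguments to mean()
rounded_mean <- function(x, ...) {
  round(mean(x, ...), 1)
}

values <- c(2, 4, NA, 9)

#without na.rm, mean() returns NA because of the missing value
rounded_mean(values)

#[1] NA

#na.rm is passed through ... to mean()
rounded_mean(values, na.rm = TRUE)

#[1] 5
```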

What Makes a Good Function?

A good function in R has three properties: It is (reasonably) short, performs a single operation, and has an intuitive name.

Short – The point of functions is to make R code easier and more convenient to both read and write. If a function is several hundred or thousands of lines long, that largely defeats the purpose of using a function.

Performs a single operation – A function should ideally perform a single operation to keep things as simple as possible.

Uses intuitive names – A function that multiplies three numbers together should have an intuitive name like multiply_three_numbers as opposed to something vague like m3. A descriptive name makes it much easier to understand what a function does, and it makes an R script that calls the function easier to follow.
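
To make these properties concrete, the sketch below splits a small calculation into two short functions that each do one job and have descriptive names. Both names, standard_error and mean_with_standard_error, are made up for this illustration.

```r
#one job: compute the standard error of the mean
standard_error <- function(x) {
  sd(x) / sqrt(length(x))
}

#one job: combine the mean and its standard error into a named vector
mean_with_standard_error <- function(x) {
  c(mean = mean(x), se = standard_error(x))
}

#run function
result <- mean_with_standard_error(c(2, 4, 6, 8))
result
```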

Functions Increase Productivity

Functions increase productivity by allowing you to write a piece of code one time and then reuse it whenever you need to perform that operation. Functions also make an R script more readable and user friendly.
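
As a brief sketch of this kind of reuse, the snippet below defines multiply_three_numbers once and then calls it with several sets of inputs, first directly and then across the rows of a small data frame using base R’s mapply(). The data frame inputs and its column names are made-up sample data.

```r
multiply_three_numbers <- function(num1, num2, num3) {
  num1 * num2 * num3
}

#reuse the function with different inputs
multiply_three_numbers(1, 2, 3)

#[1] 6

multiply_three_numbers(2, 5, 10)

#[1] 100

#apply the function to each row of a small table of inputs
inputs <- data.frame(a = c(1, 2, 3), b = c(2, 5, 4), c = c(3, 10, 5))
mapply(multiply_three_numbers, inputs$a, inputs$b, inputs$c)

#[1]   6 100  60
```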

Additional Resources

To learn more about functions in R, check out the following resources:

Hadley Wickham – Functions
R Documentation – Function Definition
