The **sample()** function in R allows you to take a random sample of elements from a dataset or a vector, either with or without replacement.

The basic syntax for the sample() function is as follows:

**sample(x, size, replace = FALSE, prob = NULL)**

**x**: a vector from which to choose the sample; to sample rows of a dataset, pass its row indices (if x is a single number n >= 1, sample() draws from 1:n)

**size**: size of the sample

**replace**: should sampling be with replacement? (this is FALSE by default)

**prob**: a vector of probability weights for obtaining the elements of the vector being sampled
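As a quick illustration of the **prob** argument described above, here is a minimal sketch of weighted sampling. The vector names and weight values are made up for this example and are not part of the original tutorial.

```r
# Hypothetical example values: three outcomes with unequal weights
outcomes <- c("low", "medium", "high")
weights  <- c(0.7, 0.2, 0.1)  # weights need not sum to 1; sample() normalizes them

# Draw 1000 values with replacement, using the weights above
draws <- sample(outcomes, size = 1000, replace = TRUE, prob = weights)

# Tabulate the proportions; they should be roughly 0.7, 0.2 and 0.1,
# but the exact values vary from run to run
prop.table(table(draws))
```

Because the draws are random, the tabulated proportions will only approximate the weights.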

*The complete documentation for sample() can be viewed by running ?sample in the R console.*

The following examples illustrate practical uses of sample().

**Generating a Sample from a Vector**

Suppose we have vector *a* with 10 elements in it:

    #define vector a with 10 elements in it
    a <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

To generate a random sample of 5 elements from vector *a* without replacement, we can use the following syntax:

    #generate random sample of 5 elements from vector a
    sample(a, 5)
    #[1] 3 1 4 7 5

Note that each time we generate a random sample, we will most likely get a different set of elements.

    #generate another random sample of 5 elements from vector a
    sample(a, 5)
    #[1] 1 8 7 4 2

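If we would like to be able to replicate our results and work with the same sample each time, we can use **set.seed()**. The seed must be set to the same value immediately before each call to sample() that should return the same result; otherwise the second call continues the random stream and returns a different sample.

    #set.seed(some random number) to ensure that we get the same sample each time
    set.seed(122)

    #define vector a with 10 elements in it
    a <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

    #generate random sample of 5 elements from vector a
    sample(a, 5)
    #[1] 10 9 2 1 4

    #reset the seed, then generate another random sample of 5 elements from vector a
    set.seed(122)
    sample(a, 5)
    #[1] 10 9 2 1 4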
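One way to confirm that a seed really reproduces a sample is to compare two draws programmatically. The following is a minimal sketch; the variable names are chosen for illustration only.

```r
a <- 1:10

# First draw, with the seed set just before sampling
set.seed(122)
first_draw <- sample(a, 5)

# Second draw, with the seed reset to the same value
set.seed(122)
second_draw <- sample(a, 5)

# The two draws contain the same elements in the same order
identical(first_draw, second_draw)  # TRUE

# A further draw without resetting the seed continues the random stream,
# so it is not tied to the two draws above
third_draw <- sample(a, 5)
```

The comparison returns TRUE regardless of which values the seed produces, so the check does not depend on the R version in use.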

We can also use the argument **replace = TRUE** so that we are sampling with replacement. This means that each element in the vector can be chosen to be in the sample more than once.

    #generate random sample of 5 elements from vector a using sampling with replacement
    sample(a, 5, replace = TRUE)
    #[1] 10 10 2 1 6
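Sampling with replacement also matters when the requested sample is larger than the vector itself. The sketch below, which uses illustrative variable names, shows that sample() raises an error in that case unless replace = TRUE is set.

```r
a <- 1:10

# Without replacement, asking for 15 elements from a 10-element vector fails;
# tryCatch() captures the error message instead of stopping the script
result <- tryCatch(
  sample(a, 15),
  error = function(e) conditionMessage(e)
)
result  # an error message about the sample being larger than the population

# With replacement, the same request succeeds
big_sample <- sample(a, 15, replace = TRUE)
length(big_sample)  # 15
```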

**Generating a Sample from a Dataset**

Another common use of the sample() function is to generate a random sample of rows from a dataset. For the following example, we will generate a random sample of 10 rows from the built-in R dataset **iris**, which has 150 total rows.

    #view first 6 rows of iris dataset
    head(iris)

    #  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    #1          5.1         3.5          1.4         0.2  setosa
    #2          4.9         3.0          1.4         0.2  setosa
    #3          4.7         3.2          1.3         0.2  setosa
    #4          4.6         3.1          1.5         0.2  setosa
    #5          5.0         3.6          1.4         0.2  setosa
    #6          5.4         3.9          1.7         0.4  setosa

    #set seed to ensure that this example is replicable
    set.seed(100)

    #choose a random vector of 10 elements from all 150 rows in iris dataset
    sample_rows <- sample(1:nrow(iris), 10)
    sample_rows

    #[1]  47  39  82   9  69  71 117  53  78  25

    #choose the 10 rows of the iris dataset that match the row numbers above
    #(stored as iris_sample so the object does not share a name with sample())
    iris_sample <- iris[sample_rows, ]
    iris_sample

    #    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
    #47           5.1         3.8          1.6         0.2     setosa
    #39           4.4         3.0          1.3         0.2     setosa
    #82           5.5         2.4          3.7         1.0 versicolor
    #9            4.4         2.9          1.4         0.2     setosa
    #69           6.2         2.2          4.5         1.5 versicolor
    #71           5.9         3.2          4.8         1.8 versicolor
    #117          6.5         3.0          5.5         1.8  virginica
    #53           6.9         3.1          4.9         1.5 versicolor
    #78           6.7         3.0          5.0         1.7 versicolor
    #25           4.8         3.4          1.9         0.2     setosa

Note that if you copy and paste the above code into your own R console, you should get the same sample, since we used **set.seed(100)** to fix the random number stream. Results can still differ between R versions, because the default sampling algorithm changed in R 3.6.0.
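The row-sampling steps above can be wrapped in a small reusable function. The sketch below is one possible way to do this; the helper name sample_n_rows is hypothetical and is not part of base R.

```r
# Hypothetical helper: return n randomly chosen rows of a data frame
sample_n_rows <- function(df, n, replace = FALSE) {
  # seq_len() is used instead of 1:nrow(df) so that an empty data frame
  # yields an empty index vector rather than c(1, 0)
  idx <- sample(seq_len(nrow(df)), size = n, replace = replace)
  df[idx, , drop = FALSE]
}

set.seed(100)
iris_subset <- sample_n_rows(iris, 10)
nrow(iris_subset)  # 10
```

Passing replace = TRUE to this helper allows the same row to appear more than once.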