# The Easiest Way to Create Summary Tables in R

The easiest way to create summary tables in R is to use the describe() and describeBy() functions from the psych package.

```r
library(psych)

#create summary table
describe(df)

#create summary table, grouped by a specific variable
describeBy(df, group=df$var_name)
```

The following examples show how to use these functions in practice.

### Example 1: Create Basic Summary Table

Suppose we have the following data frame in R:

```r
#create data frame
df <- data.frame(team=c('A', 'A', 'B', 'B', 'C', 'C', 'C'),
                 points=c(15, 22, 29, 41, 30, 11, 19),
                 rebounds=c(7, 8, 6, 6, 7, 9, 13),
                 steals=c(1, 1, 2, 3, 5, 7, 5))

#view data frame
df

  team points rebounds steals
1    A     15        7      1
2    A     22        8      1
3    B     29        6      2
4    B     41        6      3
5    C     30        7      5
6    C     11        9      7
7    C     19       13      5
```

We can use the describe() function to create a summary table for each variable in the data frame:

```r
library(psych)

#create summary table
describe(df)

         vars n  mean    sd median trimmed   mad min max range  skew kurtosis
team*       1 7  2.14  0.90      2    2.14  1.48   1   3     2 -0.22    -1.90
points      2 7 23.86 10.24     22   23.86 10.38  11  41    30  0.33    -1.41
rebounds    3 7  8.00  2.45      7    8.00  1.48   6  13     7  1.05    -0.38
steals      4 7  3.43  2.30      3    3.43  2.97   1   7     6  0.25    -1.73
           se
team*    0.34
points   3.87
rebounds 0.93
steals   0.87
```

Here’s how to interpret each value in the output:

• vars: The column number
• n: The number of valid (non-missing) cases
• mean: The mean value
• sd: The standard deviation
• median: The median value
• trimmed: The trimmed mean (default trims 10% of observations from each end)
• mad: The median absolute deviation (from the median)
• min: The minimum value
• max: The maximum value
• range: The range of values (max – min)
• skew: The skewness
• kurtosis: The kurtosis
• se: The standard error of the mean
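As a rough cross-check, most of these statistics can be reproduced with base R functions. The sketch below recomputes the points row by hand; the names points_summary and n are chosen here for illustration, and skew and kurtosis are left out because their exact formulas depend on options in describe().

```r
# Base R cross-check of the 'points' row (skew and kurtosis omitted)
points <- c(15, 22, 29, 41, 30, 11, 19)

n <- length(points)

points_summary <- c(
  n       = n,
  mean    = mean(points),
  sd      = sd(points),
  median  = median(points),
  trimmed = mean(points, trim = 0.1),  # 10% trimmed from each end
  mad     = mad(points),               # scaled by 1.4826 by default
  min     = min(points),
  max     = max(points),
  range   = max(points) - min(points),
  se      = sd(points) / sqrt(n)       # standard error of the mean
)

#view rounded results
round(points_summary, 2)
```

With only 7 observations, trimming 10% from each end removes no values, which is why the trimmed mean equals the ordinary mean in this example.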

It’s important to note that any variable marked with an asterisk (*) is a categorical or logical variable that has been converted to a numeric variable, with each distinct value replaced by an integer code that reflects its sort order.

In our example, the variable ‘team’ has been converted to a numeric variable (A = 1, B = 2, C = 3), so we shouldn’t interpret its summary statistics literally.
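To see where the numbers in the team row come from, the conversion can be approximated in base R. This is only a sketch of the idea, not the internal code of describe(), and team_codes is a name introduced here for illustration.

```r
# Approximate the conversion applied to the 'team' column
team <- c('A', 'A', 'B', 'B', 'C', 'C', 'C')

# factor() sorts the distinct values, as.numeric() returns their integer codes
team_codes <- as.numeric(factor(team))

#view the codes
team_codes

#mean of the codes, which matches the 2.14 reported for team*
round(mean(team_codes), 2)
```

The mean of 2.14 describes the integer codes, not the teams themselves, which is why it has no practical interpretation.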

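Also note that you can use the argument fast=TRUE to calculate only the most common summary statistics (n, mean, sd, min, max, range, and se). In the output below, the statistics for the character variable ‘team’ appear as NaN, NA, or Inf and should be ignored: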

```r
#create smaller summary table
describe(df, fast=TRUE)

         vars n  mean    sd min  max range   se
team        1 7   NaN    NA Inf -Inf  -Inf   NA
points      2 7 23.86 10.24  11   41    30 3.87
rebounds    3 7  8.00  2.45   6   13     7 0.93
steals      4 7  3.43  2.30   1    7     6 0.87
```

We can also choose to only compute the summary statistics for certain variables in the data frame:

```r
#create summary table for just 'points' and 'rebounds' columns
describe(df[ , c('points', 'rebounds')], fast=TRUE)

         vars n  mean    sd min max range   se
points      1 7 23.86 10.24  11  41    30 3.87
rebounds    2 7  8.00  2.45   6  13     7 0.93
```
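Rather than naming columns one by one, it may be convenient to keep every numeric column automatically. The following is a minimal sketch of that idea; numeric_df is a name introduced here, and the call to describe() only runs if the psych package is installed.

```r
#create data frame
df <- data.frame(team=c('A', 'A', 'B', 'B', 'C', 'C', 'C'),
                 points=c(15, 22, 29, 41, 30, 11, 19),
                 rebounds=c(7, 8, 6, 6, 7, 9, 13),
                 steals=c(1, 1, 2, 3, 5, 7, 5))

#keep only the numeric columns
numeric_df <- df[sapply(df, is.numeric)]

#view the names of the columns that were kept
names(numeric_df)

#create summary table for the numeric columns (requires psych)
if (requireNamespace("psych", quietly = TRUE)) {
  print(psych::describe(numeric_df, fast = TRUE))
}
```

This drops the ‘team’ column before the summary is created, so no row of NaN and Inf values appears in the table.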

### Example 2: Create Summary Table, Grouped by Specific Variable

The following code shows how to use the describeBy() function to create a summary table for the data frame, grouped by the ‘team’ variable:

```r
#create summary table, grouped by 'team' variable
describeBy(df, group=df$team, fast=TRUE)

Descriptive statistics by group
group: A
         vars n mean   sd min  max range  se
team        1 2  NaN   NA Inf -Inf  -Inf  NA
points      2 2 18.5 4.95  15   22     7 3.5
rebounds    3 2  7.5 0.71   7    8     1 0.5
steals      4 2  1.0 0.00   1    1     0 0.0
------------------------------------------------------------
group: B
         vars n mean   sd min  max range  se
team        1 2  NaN   NA Inf -Inf  -Inf  NA
points      2 2 35.0 8.49  29   41    12 6.0
rebounds    3 2  6.0 0.00   6    6     0 0.0
steals      4 2  2.5 0.71   2    3     1 0.5
------------------------------------------------------------
group: C
         vars n  mean   sd min  max range   se
team        1 3   NaN   NA Inf -Inf  -Inf   NA
points      2 3 20.00 9.54  11   30    19 5.51
rebounds    3 3  9.67 3.06   7   13     6 1.76
steals      4 3  5.67 1.15   5    7     2 0.67
```

The output shows the summary statistics for each of the three teams in the data frame.
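If you want to verify a few of the grouped values without psych, base R's tapply() offers a quick cross-check. The sketch below recomputes the mean and standard deviation of points for each team; points_mean and points_sd are names introduced here for illustration.

```r
#create data frame with the two columns needed for the check
df <- data.frame(team=c('A', 'A', 'B', 'B', 'C', 'C', 'C'),
                 points=c(15, 22, 29, 41, 30, 11, 19))

#calculate mean and standard deviation of points for each team
points_mean <- tapply(df$points, df$team, mean)
points_sd <- tapply(df$points, df$team, sd)

#view rounded results
round(points_mean, 2)
round(points_sd, 2)
```

The results should match the mean and sd columns of the points rows in the grouped output above.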