# How to Use the stat.desc() Function in R

Often you may want to create a table that contains descriptive statistics for variables in a data frame in R.

One of the easiest ways to do so is with the stat.desc() function from the pastecs package.

The stat.desc() function uses the following syntax:

stat.desc(x, basic=TRUE, desc=TRUE, norm=FALSE, p=0.95)

where:

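• x: The data frame (or time series) to summarize
• basic: Whether to return basic statistics (counts, min, max, range, sum) or not
• desc: Whether to return descriptive statistics (median, mean, standard error, variance and related values) or not
• norm: Whether to return normal distribution statistics (skewness, kurtosis, normality test) or not
• p: The confidence level to use when calculating the confidence interval for the mean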
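As a minimal sketch of how the three logical arguments shape the output, the calls below toggle each one on a small numeric data frame. The data frame `scores` and its values are made up purely for illustration, and the row names mentioned in the comments are the ones stat.desc() is documented to return.

```r
library(pastecs)

# hypothetical data, used only for illustration
scores <- data.frame(x=c(2, 4, 4, 7, 9),
                     y=c(1, 0, 3, 5, 8))

# basic block only: nbr.val, nbr.null, nbr.na, min, max, range, sum
basic_only <- stat.desc(scores, basic=TRUE, desc=FALSE)

# descriptive block only: median, mean, SE.mean, CI.mean.0.95, var, std.dev, coef.var
desc_only <- stat.desc(scores, basic=FALSE, desc=TRUE)

# norm=TRUE appends skewness, kurtosis and Shapiro-Wilk normality test rows
with_norm <- stat.desc(scores, norm=TRUE)

#compare which rows each call returns
rownames(basic_only)
rownames(desc_only)
rownames(with_norm)
```

Comparing the row names of the three results is a quick way to see which statistics each argument controls.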

The following example shows how to use the stat.desc() function in practice in R.

## Example: How to Use the stat.desc() Function in R

Suppose that we create a data frame in R that contains information about various basketball players including their team name, total points scored and total assists:

```r
#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(22, 39, 24, 18, 15, 10, 28, 23),
                 assists=c(3, 8, 8, 6, 10, 14, 8, 17))

#view data frame
df

  team points assists
1    A     22       3
2    A     39       8
3    A     24       8
4    A     18       6
5    B     15      10
6    B     10      14
7    B     28       8
8    B     23      17
```

Suppose that we would like to calculate descriptive statistics for each of the columns in the data frame.

We can use the stat.desc() function from the pastecs package to do so:

```r
library(pastecs)

#calculate table of descriptive statistics for each column in data frame
stat.desc(df)

             team      points   assists
nbr.val        NA   8.0000000  8.000000
nbr.null       NA   0.0000000  0.000000
nbr.na         NA   0.0000000  0.000000
min            NA  10.0000000  3.000000
max            NA  39.0000000 17.000000
range          NA  29.0000000 14.000000
sum            NA 179.0000000 74.000000
median         NA  22.5000000  8.000000
mean           NA  22.3750000  9.250000
SE.mean        NA   3.0991790  1.566958
CI.mean.0.95   NA   7.3283939  3.705267
var            NA  76.8392857 19.642857
std.dev        NA   8.7658021  4.432026
coef.var       NA   0.3917677  0.479138
```

The stat.desc() function returns a table of descriptive statistics for each of the columns in the data frame.

Notice that each of the values in the team column is shown as NA since we are not able to calculate numerical descriptive statistics for a column that only contains character values.

Instead, we can use the following syntax to calculate descriptive statistics for only the points and assists columns in the data frame:

```r
library(pastecs)

#calculate table of descriptive statistics for points and assists columns
stat.desc(df[c('points', 'assists')])

                  points   assists
nbr.val        8.0000000  8.000000
nbr.null       0.0000000  0.000000
nbr.na         0.0000000  0.000000
min           10.0000000  3.000000
max           39.0000000 17.000000
range         29.0000000 14.000000
sum          179.0000000 74.000000
median        22.5000000  8.000000
mean          22.3750000  9.250000
SE.mean        3.0991790  1.566958
CI.mean.0.95   7.3283939  3.705267
var           76.8392857 19.642857
std.dev        8.7658021  4.432026
coef.var       0.3917677  0.479138
```
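When a data frame has many columns, typing each numeric column name can get tedious. One possible alternative, sketched here with base R's sapply() and is.numeric(), is to select the numeric columns automatically. The helper names `is_num` and `numeric_stats` are chosen only for this illustration.

```r
library(pastecs)

#same data frame as in the example
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(22, 39, 24, 18, 15, 10, 28, 23),
                 assists=c(3, 8, 8, 6, 10, 14, 8, 17))

#flag which columns are numeric
is_num <- sapply(df, is.numeric)

#calculate descriptive statistics for the numeric columns only
numeric_stats <- stat.desc(df[is_num])
numeric_stats
```

This should return the same table as selecting the points and assists columns by name.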

Here is how to interpret each value in the output:

• nbr.val: Number of values
• nbr.null: Number of null values (values equal to zero)
• nbr.na: Number of NA values
• min: Minimum value
• max: Maximum value
• range: Range (max – min) of values
• sum: Sum of values
• median: Median value
• mean: Mean value
• SE.mean: Standard error of the mean
• CI.mean.0.95: Half-width of the 95% confidence interval for the mean (the interval is the mean plus or minus this value)
• var: Variance of values
• std.dev: Standard deviation of values
• coef.var: Coefficient of variation of values
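
To make the less familiar rows concrete, the following sketch reproduces SE.mean, CI.mean.0.95 and coef.var for the points column by hand using only base R. It assumes the confidence interval is based on the t distribution with n - 1 degrees of freedom, which matches the values in the output above.

```r
#points column from the example data frame
points <- c(22, 39, 24, 18, 15, 10, 28, 23)

n <- length(points)

#standard error of the mean: standard deviation divided by sqrt(n)
se_mean <- sd(points) / sqrt(n)

#half-width of the 95% confidence interval (t distribution, n - 1 df)
ci_half_width <- qt(0.975, df=n - 1) * se_mean

#coefficient of variation: standard deviation divided by the mean
coef_var <- sd(points) / mean(points)

#lower and upper limits of the 95% confidence interval for the mean
ci_limits <- mean(points) + c(-1, 1) * ci_half_width

round(c(se_mean, ci_half_width, coef_var), 4)

[1] 3.0992 7.3284 0.3918
```

The three rounded values agree with the SE.mean, CI.mean.0.95 and coef.var rows for points, and `ci_limits` holds the actual interval around the mean of 22.375.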

Note that you can use the p argument of the stat.desc() function to change the confidence level (for example, p=0.99 for a 99% level) used when calculating the confidence interval value for the mean.
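
As a short sketch of the p argument in use, the call below requests a 99% confidence level for the numeric columns of the example data. The name `stats_99` is chosen only for this illustration.

```r
library(pastecs)

#numeric columns from the example data frame
df <- data.frame(points=c(22, 39, 24, 18, 15, 10, 28, 23),
                 assists=c(3, 8, 8, 6, 10, 14, 8, 17))

#use a 99% confidence level instead of the default 95%
stats_99 <- stat.desc(df, p=0.99)

#the confidence interval row is labeled with the chosen level
stats_99['CI.mean.0.99', ]
```

Because a higher confidence level gives a wider interval, the values in this row should be larger than the corresponding CI.mean.0.95 values.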