In statistics, an observation is simply one occurrence of something you’re measuring. For example, suppose you’re measuring the weight of a certain species of turtle. Each…

# Author: Zach

To normalize the values in a dataset to be between 0 and 100, you can use the following formula: zi = (xi – min(x)) / (max(x)…
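
The formula in the excerpt above is cut off, so the sketch below assumes the usual min-max scaling: subtract the minimum, divide by the range, and multiply by 100. The function name `normalize_0_100` and the sample data are made up for illustration.

```python
def normalize_0_100(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 100.

    Assumes standard min-max scaling; the truncated excerpt may differ in detail.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # With a zero range the scaling is undefined (division by zero).
        raise ValueError("all values are equal; the range is zero")
    return [(x - lo) / (hi - lo) * 100 for x in values]


data = [10, 20, 30, 40, 50]   # hypothetical sample data
scaled = normalize_0_100(data)
print(scaled)
```

The smallest value always maps to 0 and the largest to 100; everything else falls in between in proportion to its distance from the minimum.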

A Pearson Correlation Coefficient measures the linear association between two variables. It always takes on a value between -1 and 1 where: -1 indicates a…
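
As a rough illustration of the -1 to 1 range, here is a minimal pure-Python sketch of the coefficient, computed as the covariance of the two variables divided by the product of their standard deviations. The function name `pearson_r` and the sample data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations (numerator of r).
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the summed squared deviations (denominator of r).
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / (dev_x * dev_y)


hours = [1, 2, 3, 4, 5]            # hypothetical sample data
score_up = [2, 4, 6, 8, 10]        # rises in step with hours
score_down = [10, 8, 6, 4, 2]      # falls in step with hours

r_up = pearson_r(hours, score_up)      # close to 1 (perfect positive linear relationship)
r_down = pearson_r(hours, score_down)  # close to -1 (perfect negative linear relationship)
print(r_up, r_down)
```

Because of floating-point rounding, the results may differ from exactly 1 and -1 by a tiny amount.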

Often in statistics we’re interested in answering questions like: What is the mean household income in a certain city? What is the mean weight of…

Boosting is a technique in machine learning that has been shown to produce models with high predictive accuracy. One of the most common ways to…

Occasionally you may want to drop the index column of a pandas DataFrame in Python. Since pandas DataFrames and Series always have an index, you…

Often in statistics we’re interested in collecting data so that we can answer some research question. For example, we might want to answer the following…

Most supervised machine learning algorithms are based on using a single predictive model like linear regression, logistic regression, ridge regression, etc. Methods like bagging and…

The normal distribution is the most commonly used distribution in all of statistics and is known for being symmetrical and bell-shaped. A closely related distribution…

R is one of the most popular programming languages for working with data. But before we can work with data, we have to actually get…