Random Variables and Probability Distributions

A random variable represents the result of some random process, like flipping a coin, rolling dice, or spinning a bottle. Random variables can be discrete, taking on only certain values (like heads or tails, or 1, 2, 3, 4, 5, or 6 on a die), or continuous, taking on any real value in a range (like the […]
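To make the discrete/continuous split concrete, here is a minimal sketch using Python's standard `random` module. The variable names are my own, and modelling the bottle as a resting angle in degrees is an assumption made purely for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Discrete: only a fixed set of outcomes is possible.
coin = rng.choice(["heads", "tails"])
die = rng.randint(1, 6)  # randint is inclusive on both ends

# Continuous: any real number in a range is possible.
# Treating the bottle's resting angle (in degrees) as the outcome
# is an illustrative assumption.
bottle_angle = rng.uniform(0.0, 360.0)

print(coin, die, bottle_angle)
```

Run it a few times with different seeds: the coin and the die only ever land on their listed values, while the angle is almost never the same number twice.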

The Central Limit Theorem

The Central Limit Theorem is one of the most important results in statistics. It states that, even if the original probability distribution isn’t normal, the distribution of the mean of samples drawn from it approaches a normal distribution as the sample size increases (provided the original distribution has a finite variance). What does this mean? Let’s try an experiment. You’re going to need Python (I’m using […]
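One way such an experiment could look, as a rough sketch using only the standard library: roll a fair die (a flat, decidedly non-normal distribution) many times, average each batch of rolls, and check how the averages behave. The function names, the batch size of 50, and the 2000 repetitions are all my own choices for illustration, not values taken from the post.

```python
import math
import random
import statistics


def roll_die(rng):
    """One draw from a uniform (non-normal) distribution on 1..6."""
    return rng.randint(1, 6)


def sample_means(draw, sample_size, num_samples, seed=0):
    """Draw num_samples batches of sample_size values; return each batch's mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean([draw(rng) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]


means = sample_means(roll_die, sample_size=50, num_samples=2000)

center = statistics.fmean(means)
spread = statistics.stdev(means)

# Theory for a fair die: mean 3.5, variance 35/12, so the standard
# deviation of a 50-roll average is sqrt(35/12) / sqrt(50).
expected_spread = math.sqrt(35 / 12) / math.sqrt(50)

# For a normal distribution roughly 68% of values lie within one
# standard deviation of the mean.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"mean of sample means: {center:.3f} (theory: 3.5)")
print(f"std of sample means:  {spread:.3f} (theory: {expected_spread:.3f})")
print(f"fraction within 1 sd: {within_one_sd:.3f} (normal: about 0.68)")
```

Try a small `sample_size` such as 1 or 2 and the averages still look flat or triangular; the bell shape only shows up as each batch gets larger.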

(Re)Learning Machine Learning – Logistic Regression

This time we’re going to do logistic regression. Logistic regression uses the same idea as linear regression (see my previous post) to classify input as one of two different classes. Instead of a linear function (one that looks like a straight line) we use the logistic function, an S-shaped curve that squashes any input into a value between 0 and 1 (making it “logistic”), and even though we’re using […]
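As a quick illustration of the idea, here is a minimal sketch of the prediction step only: a linear combination of the inputs is passed through the logistic (sigmoid) function and then thresholded into one of two classes. The weights, bias, and 0.5 threshold are hypothetical values picked for the example, not learned from data, and the helper names are my own.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real z to a value between 0 and 1."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(weights, bias, features, threshold=0.5):
    """Return (probability of class 1, predicted class label)."""
    # Same linear combination as in linear regression...
    z = bias + sum(w * x for w, x in zip(weights, features))
    # ...but squashed into a probability before deciding the class.
    p = sigmoid(z)
    return p, int(p >= threshold)


# Hypothetical parameters, chosen by hand for illustration only.
weights = [2.0, -1.0]
bias = 0.5

probability, label = predict(weights, bias, [1.0, 0.5])
print(probability, label)
```

Training, which is the part that actually finds good weights, is left out here; this only shows how a fitted model would turn an input into a class.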