Tech Notes

My notes on Statistics, Big Data, Cloud Computing, Cyber Security

Random Variables and Probability Distributions

Random Variable
The outcome of an experiment need not be a number, for example, the outcome when a coin is tossed can be ‘heads’ or ‘tails’. However, we often want to represent outcomes as numbers. A random variable is a function that associates a unique numerical value with every outcome of an experiment. The value of the random variable will vary from trial to trial as the experiment is repeated.

There are two types of random variable – discrete and continuous.

A random variable has either an associated probability distribution (discrete random variable) or probability density function (continuous random variable).


A coin is tossed ten times. The random variable X is the number of tails that are noted. X can only take the values 0, 1, …, 10, so X is a discrete random variable.
A light bulb is burned until it burns out. The random variable Y is its lifetime in hours. Y can take any positive real value, so Y is a continuous random variable.
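
The coin example can be tried out in a few lines of Python. This is only a sketch: the function name count_tails, the fair coin, and the fixed seed are my own assumptions, not part of the course material. It shows the random variable X as a function that turns non-numeric outcomes ('heads' / 'tails') into a number, and that the value changes from trial to trial.

```python
import random

def count_tails(n_tosses, rng):
    """Return the number of tails seen in n_tosses tosses of a fair coin."""
    # Each toss has a non-numeric outcome; X counts how many were 'tails'.
    outcomes = [rng.choice(["heads", "tails"]) for _ in range(n_tosses)]
    return sum(1 for outcome in outcomes if outcome == "tails")

rng = random.Random(0)  # fixed seed (assumption) so the run is repeatable

# Repeat the experiment "toss a coin ten times" five times.
trials = [count_tails(10, rng) for _ in range(5)]
print(trials)  # five values of X, each a whole number between 0 and 10
```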

Discrete Random Variable
A discrete random variable is one which may take on only a countable number of distinct values, such as 0, 1, 2, 3, 4, …

Continuous Random Variable
A continuous random variable is one which can take any value in an interval of real numbers, so its possible values are infinite in number and cannot be listed one by one. Continuous random variables are usually measurements. Examples include height, weight, and the amount of sugar in an orange.

Expected Value (Population Mean)

The expected value (or population mean) of a random variable indicates its average or central value. It is a useful summary value (a number) of the variable’s distribution.

Two random variables with the same expected value can have very different distributions. There are other useful descriptive measures which capture the spread and shape of the distribution, for example the population variance.

The expected value of a random variable X is symbolised by E(X) or µ.
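
For reference, the standard definitions can be written out as below. The symbols x_i (the possible values of a discrete X) and f (the density function of a continuous Y) are notation I am introducing here; the notes themselves do not name them.

```latex
% Discrete random variable X with possible values x_i
E(X) = \mu = \sum_{i} x_i \, P(X = x_i)

% Continuous random variable Y with density function f
E(Y) = \int_{-\infty}^{\infty} y \, f(y) \, dy

% Worked example: X is the score on one roll of a fair die
E(X) = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 3.5
```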


Population Variance
The population variance is a non-negative number which gives an idea of how widely spread the values of the random variable are likely to be; the larger the variance, the more scattered the observations are on average.
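
A small sketch of the point that two random variables can share an expected value and still be spread very differently. It uses the standard definition of variance, the expected squared distance from the mean, E[(X - µ)²]. The two distributions (a fair die, and a variable that is 3 or 4 with equal probability) and the helper names are my own illustrative choices.

```python
from fractions import Fraction

def expected_value(dist):
    """E(X) for a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = E[(X - mu)^2] for a discrete distribution."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Fair six-sided die: values 1..6, each with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Hypothetical second variable: 3 or 4, each with probability 1/2.
narrow = {3: Fraction(1, 2), 4: Fraction(1, 2)}

print(expected_value(die), variance(die))        # 7/2 35/12
print(expected_value(narrow), variance(narrow))  # 7/2 1/4
```

Both have expected value 3.5, but the die's variance is more than ten times larger.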


Probability Distribution (Probability Function / Probability Mass Function)
The probability distribution of a discrete random variable is a list of the probabilities associated with each of its possible values.
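
As an illustration, here is the probability distribution of X from the coin example (number of tails in ten tosses). The sketch assumes a fair coin and independent tosses, which the notes do not state explicitly; under those assumptions each of the 2^10 toss sequences is equally likely and comb(10, k) of them contain exactly k tails.

```python
from fractions import Fraction
from math import comb

n = 10  # number of tosses

# P(X = k) = (number of sequences with k tails) / (total number of sequences)
pmf = {k: Fraction(comb(n, k), 2 ** n) for k in range(n + 1)}

for k, p in pmf.items():
    print(k, p, round(float(p), 4))

print(pmf[5])             # 63/256, the most likely value
print(sum(pmf.values()))  # 1, the probabilities add up to one
```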

Probability Density Function
A probability density function is a function which can be integrated to obtain the probability that the continuous random variable takes a value in a given interval. It is best to think of a continuous distribution in terms of the area under the graph of a function, called the density function.

If the density is constant over an interval, for example between 0 and 1, it is a uniform density function. Other types include the exponential density function and the normal (Gaussian) density function.

The probability that the random variable lies in a given interval is the area under the graph over that interval. For example, the probability that the random variable lies between 2 and 3 is the area under the density curve between 2 and 3 (the grey area in the graph).
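
The "area under the graph" idea can be checked numerically. This is a rough sketch: the exponential density with rate 1, the uniform density on the interval 0 to 1, and the simple midpoint rule used for the integration are all my own choices for illustration.

```python
import math

def exponential_pdf(x, rate=1.0):
    """Exponential density: rate * exp(-rate * x) for x >= 0, else 0."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def uniform_pdf(x):
    """Uniform density on [0, 1]: constant height 1 inside, 0 outside."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def area_under(pdf, a, b, steps=10_000):
    """Approximate the integral of pdf from a to b with the midpoint rule."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

# Probability that an exponential(1) variable lies between 2 and 3.
approx = area_under(exponential_pdf, 2, 3)
exact = math.exp(-2) - math.exp(-3)  # closed form for this density
print(approx, exact)  # both about 0.0855

# Probability that a uniform(0, 1) variable lies between 0.2 and 0.5.
uniform_prob = area_under(uniform_pdf, 0.2, 0.5)
print(uniform_prob)  # about 0.3, i.e. the width of the interval
```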

Central Limit Theorem
The Central Limit Theorem states that whenever a random sample of size n is taken from any distribution with mean µ and finite variance σ², the sample mean x̄ will be approximately normally distributed with mean µ and variance σ²/n, provided n is large enough. The larger the sample size n, the better the approximation to the normal.

This is very useful when it comes to inference. For example, it allows us (if the sample size is fairly large) to use hypothesis tests which assume normality even if our data appear non-normal. This is because the tests use the sample mean x̄, which the Central Limit Theorem tells us will be approximately normally distributed.
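
A quick simulation makes this concrete. It is a sketch under my own assumptions: the data come from an exponential distribution with rate 1 (clearly non-normal, with mean 1 and variance 1), each sample has n = 50 observations, and 5000 sample means are collected with a fixed seed.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed (assumption) so the run is repeatable

n = 50              # size of each random sample
num_samples = 5000  # how many sample means we collect

def sample_mean(size):
    """Mean of `size` draws from an exponential distribution with rate 1."""
    return statistics.fmean([rng.expovariate(1.0) for _ in range(size)])

means = [sample_mean(n) for _ in range(num_samples)]

mean_of_means = statistics.fmean(means)
var_of_means = statistics.pvariance(means)
sd = var_of_means ** 0.5

# Share of sample means lying within one standard deviation of their mean.
within_one_sd = sum(abs(m - mean_of_means) <= sd for m in means) / num_samples

print(mean_of_means)  # should be close to the population mean, 1
print(var_of_means)   # should be close to variance / n = 1 / 50 = 0.02
print(within_one_sd)  # a normal distribution puts about 68% within one sd
```

Plotting a histogram of means would show the familiar bell shape, even though a histogram of the raw exponential data is strongly skewed.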

Law of Large Numbers

The law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.

The LLN is important because it “guarantees” stable long-term results for the averages of random events. For example, while a casino may lose money in a single spin of the roulette wheel, its earnings will tend towards a predictable percentage over a large number of spins. Any winning streak by a player will eventually be overcome by the parameters of the game. It is important to remember that the LLN only applies (as the name indicates) when a large number of observations are considered.
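
The effect is easy to see in a simulation. The sketch below uses rolls of a fair die (expected value 3.5) rather than a roulette wheel, purely to keep it short; the seed and the checkpoints are my own choices.

```python
import random

rng = random.Random(7)  # fixed seed (assumption) so the run is repeatable

checkpoints = [10, 100, 1_000, 10_000, 100_000]
running_average = {}

total = 0
for i in range(1, checkpoints[-1] + 1):
    total += rng.randint(1, 6)  # one roll of a fair six-sided die
    if i in checkpoints:
        running_average[i] = total / i

for rolls, average in running_average.items():
    # The average can wander for small numbers of rolls,
    # but settles near the expected value 3.5 as rolls increase.
    print(rolls, average)
```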

Disclaimer: These are my study notes, kept online instead of on paper so that others can benefit. In the process I have used some pictures / content from other original authors. All sources / original content publishers are listed below and they deserve credit for their work. No copyright violation intended.

References for these notes:

The study material for the MOOC “Making sense of data” at
