Random Variables

Random what?

For me, randomness meant something we could not predict at all. And a variable, as I understood it, was something that can take different values depending on the circumstances. Put the two together and... I just could not understand at first what those statistics texts meant by random variable. Is it any possible value in any random situation? Wait... what?

But thankfully, now I do know what they mean. HOORAY! (If you have ever watched StatQuest, you will get this expression. The channel is like a statistical bible in video form, and I am totally taking this expression with me for the rest of my life and posts.) ☺️

Random Variables

A random variable is just a variable whose value we don't know in advance. That does not mean we don't know the values (or the range of values) it can take; in fact, we do! And to each value (or range) we can assign a probability of it happening.

More precisely, a random variable assigns a number to each outcome of the event (situation/experiment) we are interested in. Hence, knowing the event, we can choose which outcomes we want to analyze. Here are some examples:

| Event | Outcome we are interested in (Random Variable) |
| --- | --- |
| Flipping a coin 5 times | number of heads |
| Powerlifting championship | contestants' scores |
| Tossing two dice | sum of the two dice |
| Exercising weekly | how many days per week I exercise |

A random variable is written with a capital letter such as X, and its possible values with lowercase letters: x₁, x₂, x₃, ..., xₙ. Notice that I am enumerating (counting) these possibilities, but this is not always possible, as we will see next. It is just for you to understand that:

A given random variable has n possible values that it can randomly take
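To make this concrete, here is a small sketch of the coin example from the table, assuming a fair coin (the text above does not say the coin is fair, so that part is my assumption). The function name `number_of_heads` is just something I made up for illustration. It lists every possible sequence of 5 flips, turns each one into a number, and counts how often each number shows up.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every possible result of flipping a coin 5 times: 2**5 = 32 sequences.
sequences = list(product("HT", repeat=5))


def number_of_heads(sequence):
    """The random variable X: turns one outcome (a sequence of flips) into a number."""
    return sequence.count("H")


# Count how many sequences lead to each value of X.
counts = Counter(number_of_heads(s) for s in sequences)

# Assuming a fair coin, every sequence is equally likely,
# so P(X = x) is the share of sequences that give x heads.
distribution = {x: Fraction(c, len(sequences)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```

Here X can take n = 6 values (0 to 5 heads), each with its own probability, and those probabilities add up to 1.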

Types of Random Variables

There are two types of random variables: discrete and continuous.

Figure: Types of random variables

Be aware that the event by itself does not determine which type of variable we are dealing with; what we want to know about the event does. As an example, let's say we want to know how much time it takes to get to university. If we want to know the EXACT time, we are dealing with a continuous variable, because the time can be any value within a range. It could take 12.31018793220 minutes, with as many decimals as you want to add. The possible values are not countable.

On the other hand, if I only want the time rounded to whole minutes, I can count the possible values: 12, 13, 14 minutes... and so on (I can count!). In this case, I am dealing with a discrete variable.
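Here is a rough sketch of that difference in Python. The commute times are made up: I am assuming they fall anywhere between 10 and 15 minutes, just to have numbers to play with. Keep in mind that a computer float is only an approximation of a truly continuous value, but it is close enough to show the idea.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical "exact" commute times in minutes (the 10 to 15 range is an assumption).
exact_times = [random.uniform(10, 15) for _ in range(1000)]

# The same commutes, measured in whole minutes only.
rounded_times = [round(t) for t in exact_times]

distinct_exact = len(set(exact_times))
distinct_rounded = sorted(set(rounded_times))

print("Distinct exact times:  ", distinct_exact)
print("Distinct rounded times:", distinct_rounded)
```

The exact times practically never repeat, while the rounded ones fall into a handful of values we can list one by one.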

Probability

When talking about random variables we are also talking about probabilities, as the two are closely tied. I mean, all we want to do with a random variable is find the probability of it taking certain values. Or rather, when analyzing an event, all we want to know is how likely the outcomes are. The notation for the probability of a random variable X taking some value is P(X).

For discrete variables, we can find the probability of X taking a value x₁. So, P(X = x₁) can be defined! The probabilities can be estimated from a frequency chart (histogram), using the relative frequency of each value.

Figure: Histogram for discrete variable X
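As a sketch of how such a frequency chart turns into probabilities, here is the two dice example from the table, simulated in Python. I am assuming fair six-sided dice, and the number of rolls is arbitrary. The estimated probability of each sum is simply how often it appeared divided by the total number of rolls, and I print the exact value next to it for comparison.

```python
import random
from collections import Counter

random.seed(1)  # fixed seed so the run is repeatable
n_rolls = 100_000  # arbitrary number of simulated rolls

# X = sum of two fair six-sided dice (fairness is an assumption).
sums = [random.randint(1, 6) + random.randint(1, 6) for _ in range(n_rolls)]
frequencies = Counter(sums)

# Relative frequency of each value: our estimate of P(X = x).
estimated = {x: frequencies[x] / n_rolls for x in range(2, 13)}

# Exact probabilities: (6 - |x - 7|) of the 36 equally likely pairs add up to x.
exact = {x: (6 - abs(x - 7)) / 36 for x in range(2, 13)}

print(" x  estimated  exact")
for x in range(2, 13):
    bar = "#" * round(estimated[x] * 100)  # a tiny text histogram
    print(f"{x:2d}  {estimated[x]:.3f}      {exact[x]:.3f}  {bar}")
```

With this many rolls the estimates should land very close to the exact values, and the text bars give the same shape as the histogram.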

Conversely, for a continuous variable the probability of taking any single exact value is zero, because there are infinitely many values it could take. So, if we have a continuous variable Y, asking for P(Y = y₁) is not useful: the answer is always 0. What is the probability of it raining exactly 1.00001 mm today? Or 1.00000000000001589 mm? Or 1.06554979899984098090 mm? Ok, I think you got it 😁 Instead, we analyze a range or interval of values: P(y₁ < Y < y₂) can be greater than zero! And it actually makes more sense.

For this, we just need to calculate the area under the probability curve. This curve also comes from a histogram, in which the values are grouped into intervals/bins (these intervals can be as small as we want). Then a smooth line is drawn following the tops of the histogram bars. This line is the probability density curve.

Figure: Continuous variable and its probability curve

Figure: The probability P(y₁ < Y < y₂) is this hatched area
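Here is a minimal sketch of that area calculation. I am assuming Y follows a bell-shaped (normal) curve with a mean of 12 and a standard deviation of 2; those numbers are mine, picked only for illustration, and real data may follow a different curve. The area is approximated by adding up many thin rectangles, just like very narrow histogram bins, and then checked against the exact value computed with `math.erf`.

```python
import math

MU, SIGMA = 12.0, 2.0  # assumed mean and standard deviation, for illustration only


def density(y):
    """Height of the normal probability density curve at y."""
    z = (y - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2 * math.pi))


def area_under_curve(y1, y2, bins=10_000):
    """Approximate P(y1 < Y < y2) by summing thin rectangles (midpoint rule)."""
    width = (y2 - y1) / bins
    return sum(density(y1 + (i + 0.5) * width) for i in range(bins)) * width


def cdf(y):
    """P(Y < y) for the normal curve, written with the error function."""
    return 0.5 * (1 + math.erf((y - MU) / (SIGMA * math.sqrt(2))))


approx = area_under_curve(10, 14)
exact = cdf(14) - cdf(10)
print(f"P(10 < Y < 14) by rectangles: {approx:.6f}")
print(f"P(10 < Y < 14) exact:         {exact:.6f}")

# The narrower the interval, the smaller the area.
for half_width in (1.0, 0.1, 0.001):
    p = area_under_curve(12 - half_width, 12 + half_width)
    print(f"P({12 - half_width} < Y < {12 + half_width}) = {p:.6f}")
```

Notice how the probability shrinks as the interval gets narrower. An interval with no width at all has no area, which is why a single exact value ends up with probability 0.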

This is just the beginning of the beginning of probability, but like any foundation, it is important to know. Hooray! o/


Thanks for reading 😊☕️
Any suggestions, questions, comments and critiques are highly welcome! Even if it is just to compliment my drawings. I know, they are awesome.