# Random Variables

# Random what?

For me, randomness was about something we could not predict at all. And when I put that together with the concept of a *variable*, which I took to be an element that can take any possible value with hardly any restrictions ... I just could not understand at first what those statistics texts were referring to when they mentioned a *random variable*. Is it like any possible value for any random situation? Wait... **what?**

But thankfully now I do know what they mean. **HOORAY!** (If you have ever watched StatQuest you will get this expression. The channel is like a statistical bible in video form, and I am totally taking this expression with me for the rest of my life, and posts.) ☺️

# Random Variables

A random variable is just a variable whose value we don't know in advance. That does not mean we don't know the values (or the range of values) it can take. In fact, we do! And to each value (*or range*) we can attach a probability that it happens.

More precisely, the random variable is a number we attach to each outcome of the event (situation/experiment) we are interested in. Hence, knowing the event, we can choose which outcomes we want to analyze. Here are some examples:

| Event | Outcome we are interested in (Random Variable) |
| --- | --- |
| Flip a coin 5 times | number of heads |
| Powerlifting Championship | contestants' scores |
| Tossing two dice | sum of the dice scores |
| Exercising weekly | how many days per week I exercise |

A random variable is represented with a capital letter such as *X*, and its possible values are written in lowercase: *x₁, x₂, x₃, ..., xₙ*. Notice that I am enumerating (counting) these possibilities, but this is not always possible, as we will see now. It is just for you to understand that, under a specified random variable, I have *n* possible values that it can randomly take.
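To make this concrete, here is a minimal Python sketch of the coin-flip example, where *X* is the number of heads in 5 flips. It assumes a fair coin, and the names `number_of_heads` and `possible_values` are made up for illustration. The seed is only there so the run is reproducible.

```python
import random

FLIPS = 5

# The values X can take: 0, 1, ..., 5 heads (so n = 6 possible values).
possible_values = list(range(FLIPS + 1))


def number_of_heads(rng, flips=FLIPS):
    """Draw one value of X = number of heads in `flips` fair coin flips."""
    # Each flip is heads with probability 0.5 (assumption: fair coin).
    return sum(rng.random() < 0.5 for _ in range(flips))


rng = random.Random(42)  # seeded generator, for reproducibility only
draws = [number_of_heads(rng) for _ in range(10)]

print("possible values:", possible_values)
print("ten draws of X: ", draws)
```

We know every value *X* can take before flipping anything. What we don't know is which one we will get on a given run.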

## Types of Random Variables

There are two types of random variables: **discrete** and **continuous**.

*Types of random variables*

Be aware that the event by itself does not determine which type of variable we are dealing with; what we want to know about the event does. As an example, let's say we want to know how much time it takes to get to university. If we want to know the **EXACT** time, we are dealing with a *continuous variable*, as we can never pin it down exactly. It can take 12.31018793220 minutes, or as many decimals as you want to add here. It is not *countable*.

On the other hand, if I want the time rounded to whole minutes, I can count the possible values: 12, 13, 14 minutes ... and so on (I can *count*!). In this case, I am dealing with a *discrete variable*.
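The same commute can be looked at both ways in a few lines of Python. This is only a sketch: the 10 to 20 minute range is a made-up assumption, and a uniform draw is used purely for convenience, not because real commutes behave that way.

```python
import random

rng = random.Random(0)  # seeded generator, for reproducibility only

# Hypothetical commute times, somewhere between 10 and 20 minutes.
exact_times = [rng.uniform(10, 20) for _ in range(5)]  # continuous view
rounded_times = [round(t) for t in exact_times]        # discrete view

for exact, rounded in zip(exact_times, rounded_times):
    print(f"exact: {exact:.10f} min   rounded: {rounded} min")
```

The event is identical in both columns. Only the question we ask about it changes the type of variable.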

## Probability

When talking about random variables we are also talking about probabilities, as the two are closely related. I mean, all we want to do with random variables is find the probabilities of certain values being taken. Or rather, when analyzing an event, all we want to know is how likely the outcomes are to happen. The notation for probabilities of a random variable *X* is *P(X)*.

For discrete variables, we can find the probability of *X* taking a value x₁. So, *P(X = x₁)* can be defined! The probabilities can be read off a frequency chart (histogram): the relative frequency of each value estimates its probability.

*Histogram for discrete variable X*
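For the two-dice example, we do not even need observed frequencies. Assuming two fair six-sided dice, all 36 outcomes are equally likely, so counting them gives *P(X = x)* for every possible sum. The sketch below does exactly that (`pmf` is just an illustrative name):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of tossing two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# How many outcomes produce each value of X = sum of the two dice.
counts = Counter(a + b for a, b in outcomes)

# P(X = x) = (outcomes giving x) / (total outcomes), kept as exact fractions.
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(X = {x:2d}) = {p}")
```

Plotting these values as bars gives the same kind of chart as the histogram above, and the probabilities add up to 1.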

Conversely, we cannot find a meaningful probability for a continuous variable taking one exact value: there are infinitely many possible values, so the probability of any single one is zero. Following from its definition, we analyze a **range or interval** of values instead. So, if we have a continuous variable Y, asking for P(Y = y₁) is not useful. How likely is it to rain exactly 1.00001 mm today? Or 1.00000000000001589 mm? Or 1.06554979899984098090 mm? OK, I think you got it 😁 But it is possible to know *P(y₁ < Y < y₂)*! And it actually makes more sense.

For this, we just need to calculate the area under the plotted curve of probabilities. The curve also comes from a histogram, in which the values are grouped into intervals/bins (these intervals can be as small as we want). Then a line is drawn following the tops of the histogram bins.

*Continuous variable and its probability curve*

*The probability P(y₁ < Y < y₂) is this hatched area*
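Here is a rough sketch of "area under the curve" in Python. Purely for illustration, it assumes *Y* follows a normal (bell-shaped) curve with mean 12 and standard deviation 2; nothing in this post says the real curve looks like that. The helper `area_under_curve` is a made-up name that adds up thin trapezoids, and the result is compared with what `statistics.NormalDist` gives through its cumulative distribution.

```python
from statistics import NormalDist

# Assumption for illustration only: Y is normal with mean 12 and std dev 2.
Y = NormalDist(mu=12, sigma=2)


def area_under_curve(dist, y1, y2, steps=10_000):
    """Approximate P(y1 < Y < y2) with the trapezoidal rule on the density."""
    width = (y2 - y1) / steps
    # End points count half, interior points count fully.
    total = 0.5 * (dist.pdf(y1) + dist.pdf(y2))
    for i in range(1, steps):
        total += dist.pdf(y1 + i * width)
    return total * width


approx = area_under_curve(Y, 11, 13)
exact = Y.cdf(13) - Y.cdf(11)

print(f"area by trapezoids, 11 < Y < 13: {approx:.4f}")
print(f"same interval from the CDF:      {exact:.4f}")
print(f"zero-width interval at Y = 12:   {area_under_curve(Y, 12, 12)}")
```

The last line shows why a single exact value is not interesting for a continuous variable: an interval with no width has no area.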

This is just the beginning of the beginning of probability but, like any foundational knowledge, it is important to know. Hooray! o/
