§ Elementary probability theory (TODO)


I've never learnt elementary probability theory "correctly". This is me attempting to fix it.

§ Defn: Sample space


The set of all possible outcomes, i.e. the things that could happen.

§ Defn: Outcome / Sample point / atomic event


An outcome consists of all the information about the experiment after it has been performed including the values of all random choices.

§ NOTE: Keeping straight event v/s outcome


It's easy to get confused between 'event' and 'outcome' (linguistically). I remember that one of them is an element of the sample space and the other is a subset, but I can't remember which is which. Here's how I recall it:
every experiment has an outcome. We write an outcome section when we
write a lab manual/lab record for a given experiment.

Now, when we perform an experiment, or something random happens, sometimes the result (i.e., the outcome) can be eventful; it's not linguistically right to say that some events can be outcomeful.
So, an event is a predicate over the set of outcomes; event: outcome -> bool. This is the same as being a subset of outcomes (the event is identified with the set of outcomes it considers eventful), so we have event ~= 2^outcomes.
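
To make the predicate/subset identification concrete, here is a minimal sketch. The die-roll experiment and the names `is_even`, `even_subset` and `as_predicate` are my own illustrative choices, not part of the notes above.

```python
# Hypothetical toy experiment: one roll of a six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}


def is_even(outcome: int) -> bool:
    """An event as a predicate: outcome -> bool."""
    return outcome % 2 == 0


# The same event as a subset: the outcomes the predicate finds eventful.
even_subset = {o for o in outcomes if is_even(o)}


def as_predicate(subset):
    """Going back: a subset determines a predicate by membership."""
    return lambda o: o in subset


# The two views agree on every outcome.
round_trip_ok = all(as_predicate(even_subset)(o) == is_even(o) for o in outcomes)

# There is one event per subset, so 2^|outcomes| events in total.
number_of_events = 2 ** len(outcomes)
```

Going from predicate to subset and back loses nothing, which is all that event ~= 2^outcomes is saying.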

§ Example: Monty hall


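An outcome of the Monty Hall game when the contestant switches consists of three things: the box that holds the prize, the box the contestant chooses, and the box that is revealed.
Once we know the three things, we know everything that happened.
For example, the sample point $(2, 1, 3)$, read in the order prize, choice, reveal, says that the prize is in box 2, the contestant chooses box 1, and box 3 is revealed.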
Note that not all 3-tuples correspond to sample points. For example,

§ Constructing the sample space: tree method


We build a decision tree.

§ where is the prize?


(prize 1)
(prize 2)
(prize 3)

§ player's choice


(prize 1
   (choice 1)
   (choice 2)
   (choice 3))
(prize 2
   (choice 1)
   (choice 2)
   (choice 3))
(prize 3
   (choice 1)
   (choice 2)
   (choice 3))


§ Which box is revealed


(prize 1
   (choice 1
      (reveal 2)
      (reveal 3))
   (choice 2
      (reveal 3))
   (choice 3
      (reveal 2)))
(prize 2
   (choice 1
      (reveal 3))
   (choice 2
      (reveal 1)
      (reveal 3))
   (choice 3
      (reveal 1)))
(prize 3
   (choice 1
      (reveal 2))
   (choice 2
      (reveal 1))
   (choice 3
      (reveal 1)
      (reveal 2)))

§ Win/Loss


(prize 1
   (choice 1
loss  (reveal 2)
loss  (reveal 3))
   (choice 2
win   (reveal 3))
   (choice 3
win   (reveal 2)))
(prize 2
   (choice 1
win   (reveal 3))
   (choice 2
loss  (reveal 1)
loss  (reveal 3))
   (choice 3
win   (reveal 1)))
(prize 3
   (choice 1
win   (reveal 2))
   (choice 2
win   (reveal 1))
   (choice 3
loss  (reveal 1)
loss  (reveal 2)))

Six of the twelve leaves are wins, so this seems like it's 50/50! But what we're missing is the likelihood of each outcome.
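
The tree can also be built mechanically. This is a sketch under two assumptions read off the tree: an outcome is a `(prize, choice, reveal)` triple, and the host never reveals the prize box or the contestant's own box. The names `sample_space` and `switch_wins` are hypothetical.

```python
from itertools import product

BOXES = (1, 2, 3)


def sample_space():
    """All (prize, choice, reveal) triples the host is allowed to produce."""
    return [
        (prize, choice, reveal)
        for prize, choice, reveal in product(BOXES, repeat=3)
        # Assumption: the host never reveals the prize or the contestant's box.
        if reveal != prize and reveal != choice
    ]


def switch_wins(outcome):
    """Switching wins exactly when the first choice missed the prize."""
    prize, choice, _reveal = outcome
    return choice != prize


outcomes = sample_space()
wins = [o for o in outcomes if switch_wins(o)]
print(len(outcomes), len(wins))  # prints "12 6"
```

Counting leaves gives 6 wins out of 12, which is exactly the misleading 50/50 picture: the leaves are not equally likely.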

§ Defn: Probability space


A probability space consists of a sample space (the space of all outcomes) and a probability function $P$ that maps the sample space to the real numbers, such that every outcome $w$ has $0 \leq P(w) \leq 1$, and the probabilities of all outcomes sum to one: $\sum_{w} P(w) = 1$.
Interpretation: For every outcome, $P(outcome)$ is the probability of that outcome happening in an experiment.

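§ Assumptions for monty hall


The edge probabilities used below amount to three assumptions: the prize is equally likely to be in each box; the contestant picks each box with equal probability, regardless of where the prize is; and when more than one box may be revealed, each of them is equally likely to be revealed.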

§ Assigning probabilities to each edge


(prize 1 [1/3]
   (choice 1 [1/3]
l  (reveal 2)   [1/2]
l  (reveal 3))  [1/2]
   (choice 2 [1/3]
w  (reveal 3))  [1]
   (choice 3 [1/3]
w  (reveal 2))) [1]
(prize 2 [1/3]
   (choice 1
w  (reveal 3))
   (choice 2
l  (reveal 1)
l  (reveal 3))
   (choice 3
w  (reveal 1)))
(prize 3 [1/3]
   (choice 1
w  (reveal 2))
   (choice 2
w  (reveal 1))
   (choice 3
l  (reveal 1)
l  (reveal 2)))

§ Assigning probabilities to each outcome



(prize 1 [1/3]
   (choice 1 [1/3]
l  (reveal 2)   [1/2]: 1/18
l  (reveal 3))  [1/2]: 1/18
   (choice 2 [1/3]
w  (reveal 3))  [1]: 1/9
   (choice 3 [1/3]
w  (reveal 2))) [1]: 1/9
...

So the probability of winning is going to be $6 \times 1/9 = \frac{2}{3}$.
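
As a sanity check on the arithmetic, here is a sketch that multiplies the edge probabilities along every path and sums over the winning leaves. It assumes the edge weights shown in the tree (prize 1/3, choice 1/3, reveal uniform over the boxes the host may open); `outcome_probabilities` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

BOXES = (1, 2, 3)


def outcome_probabilities():
    """Map each (prize, choice, reveal) outcome to the product of its edge probabilities."""
    probs = {}
    for prize, choice in product(BOXES, repeat=2):
        # Boxes the host may open: neither the prize nor the contestant's box.
        allowed = [b for b in BOXES if b != prize and b != choice]
        for reveal in allowed:
            # Edges: prize 1/3, choice 1/3, reveal uniform over the allowed boxes.
            probs[(prize, choice, reveal)] = (
                Fraction(1, 3) * Fraction(1, 3) * Fraction(1, len(allowed))
            )
    return probs


probs = outcome_probabilities()
p_switch_wins = sum(p for (prize, choice, _), p in probs.items() if choice != prize)
print(sum(probs.values()), p_switch_wins)  # prints "1 2/3"
```

The losing leaves each weigh 1/18 and the winning leaves each weigh 1/9, which is where the 50/50 leaf count goes wrong.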

§ Defn: Event


An event is a subset of the sample space.

§ Probability of an event


The probability that an event $E$ occurs is the sum of the probabilities of the sample points of the event: $P(E) \equiv \sum_{e \in E} P(e)$.

§ What about staying?


I win $2/3$rds of the time when I switch. Whenever switching wins, staying would have lost. So if I choose to stay, then I lose $2/3$rds of the time. We're using that

§ Gambling game



\ 2  /
 \  /
6 \/ 7
  ||
It's the same on the reverse side. It's a fair die, so the probability of getting $2$ is a third. Similarly for $6, 7$.


§ Analysis: Dice A v/s Dice C



(2
  (3
   4
   8))
(6
  (3
   4
   8))
(7
  (3
   4
   8))


(2
  (3    C
   4    C
   8))  C
(6
  (3    A
   4    A
   8))  C
(7
  (3    A
   4    A
   8))  C

Each of the outcomes has a probability $1/9$, and die $C$ wins in five of the nine, so $C$ beats $A$ with probability $5/9$.
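
The same tree, enumerated in code. This is a sketch that takes the faces of die A as 2, 6, 7 and die C as 3, 4, 8, as in the trees above; the names `DIE_A`, `DIE_C` and `p_pair` are my own.

```python
from fractions import Fraction
from itertools import product

DIE_A = (2, 6, 7)
DIE_C = (3, 4, 8)

# Each die is fair, so every (a, c) pair has probability 1/3 * 1/3 = 1/9.
p_pair = Fraction(1, 3) * Fraction(1, 3)

p_a_wins = sum(p_pair for a, c in product(DIE_A, DIE_C) if a > c)
p_c_wins = sum(p_pair for a, c in product(DIE_A, DIE_C) if c > a)
print(p_a_wins, p_c_wins)  # prints "4/9 5/9"
```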

§ Lecture 19: Conditional probability


$P(A|B)$, where both $A$ and $B$ are events, is read as the probability of $A$ given $B$.
$$ P(A|B) \equiv \frac{P(A \cap B)}{P(B)} $$

We know $B$ happens, so we normalize by $P(B)$. We then intersect $A$ with $B$ because we want both $A$ and $B$ to have happened, so we consider all outcomes that both $A$ and $B$ consider eventful, and then reweigh the probability such that our definition of "all possible outcomes" is simply "outcomes in $B$".
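
On a finite sample space the definition is a two-line function. A minimal sketch, using a fair six-sided die as a made-up example; `prob` and `cond_prob` are hypothetical helper names.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
P = {outcome: Fraction(1, 6) for outcome in range(1, 7)}


def prob(event):
    """P(E): sum of the probabilities of the outcomes in E."""
    return sum(P[o] for o in event)


def cond_prob(a, b):
    """P(A | B) = P(A ∩ B) / P(B); only defined when P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("P(B) is zero, so P(A | B) is undefined")
    return prob(a & b) / prob(b)


A = {2, 4, 6}  # the roll is even
B = {4, 5, 6}  # the roll is at least 4
print(cond_prob(A, B))  # prints "2/3"
```

Restricting to $B$ leaves three outcomes, two of which are even, so the normalized weight of $A$ is $2/3$ even though $P(A) = 1/2$.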

§ Product Rule


$$ P(A \cap B) = P(B) P(A|B) $$

follows from the definition by rearranging.

§ General Product Rule


$$ P(A_1 \cap A_2 \cap \dots \cap A_n) = P(A_1) P(A_2 | A_1) P(A_3 | A_2 \cap A_1) P(A_4 | A_3 \cap A_2 \cap A_1) \dots P(A_n | A_1 \cap \dots \cap A_{n-1}) $$
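
A quick numerical check of the rule for $n = 3$. The urn (3 red balls, 2 blue balls, three draws without replacement) is a made-up example, and `red`, `both` and `cond` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical urn: 3 red and 2 blue balls, drawn three times without replacement.
BALLS = ("r1", "r2", "r3", "b1", "b2")
draws = list(permutations(BALLS, 3))  # equally likely ordered draws
p_each = Fraction(1, len(draws))


def prob(event):
    """P(event) for an event given as a predicate on draws."""
    return p_each * sum(1 for d in draws if event(d))


def red(i):
    """Event A_i: the i-th draw (0-based) is red."""
    return lambda d: d[i].startswith("r")


def both(*events):
    """Intersection of events."""
    return lambda d: all(e(d) for e in events)


def cond(a, b):
    """P(a | b) by the definition of conditional probability."""
    return prob(both(a, b)) / prob(b)


lhs = prob(both(red(0), red(1), red(2)))
rhs = prob(red(0)) * cond(red(1), red(0)) * cond(red(2), both(red(0), red(1)))
print(lhs, rhs)  # prints "1/10 1/10"
```

The right-hand side is $3/5 \cdot 1/2 \cdot 1/3$, each factor conditioning on everything drawn so far.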

§ Example 1:


In a best two out of three series, the probability of winning the first game is $1/2$. The probability of winning a game immediately after a victory is $2/3$. The probability of winning after a loss is $1/3$. What is the probability of winning the series given that we win the first game?
Tree method:
(W1
   (W2)
   (L2
      (W3)
      (L3)))
(L1
   (W2
      (W3)
      (L3))
   (L2))

Multiplying probabilities along a path of the tree sneakily uses conditional probability, via the product rule: $P(W_1 \cap W_2) = P(W_1) P(W_2|W_1)$. Etc, solve the problem.
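
Here is one way to finish the computation, as a sketch. Given a first-game win, the series is won either by winning game 2, or by losing game 2 and then winning game 3. The function and constant names are my own, and the answer it prints is my calculation rather than something stated in the notes.

```python
from fractions import Fraction

P_WIN_AFTER_WIN = Fraction(2, 3)
P_WIN_AFTER_LOSS = Fraction(1, 3)


def p_win(previous_won: bool) -> Fraction:
    """Probability of winning a game, given the result of the previous game."""
    return P_WIN_AFTER_WIN if previous_won else P_WIN_AFTER_LOSS


def series_win_given_first_win() -> Fraction:
    """P(win series | W1), summing over the paths W1 W2 and W1 L2 W3."""
    p_w2 = p_win(previous_won=True)   # W1 -> W2: series is over, won
    p_l2 = 1 - p_w2                   # W1 -> L2: game 3 decides
    p_w3 = p_win(previous_won=False)  # L2 -> W3: series won
    return p_w2 + p_l2 * p_w3


print(series_win_given_first_win())  # prints "7/9"
```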

§ Definition: Independence


Events $A$, $B$ are independent if $P(A|B) = P(A)$ or $P(B) = 0$.

§ Disjointness and independence


Disjoint events with nonzero probability are never independent, because $P(A|B) = 0$ while $P(A) \neq 0$.

§ What do independent events look like?


We know that we need $P(A|B) = P(A)$. We know that $P(A|B)$ is how much of $A$ is within $B$. So we will have $P(A|B) = P(A)$ if the proportion of the sample space that $A$ occupies is the same as the proportion of $B$ that $A$ occupies. Heuristically, $A/S = (A \cap B)/B$.

§ Independence and intersection


If $A$ is independent of $B$ then $P(A\cap B) = P(A) P(B)$.
$$
\begin{aligned}
P(A) &= P(A|B) && \text{(given)} \\
P(A) &= P(A \cap B) / P(B) && \text{(defn of computing $P(A|B)$)} \\
P(A) P(B) &= P(A \cap B) && \text{(rearrange)}
\end{aligned}
$$
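
The product form is convenient to test directly, since it needs no division. A small sketch on a fair six-sided die (a made-up example); `independent` is a hypothetical helper name.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
P = {outcome: Fraction(1, 6) for outcome in range(1, 7)}


def prob(event):
    """P(E): sum of the probabilities of the outcomes in E."""
    return sum(P[o] for o in event)


def independent(a, b):
    """Product form of independence: P(A ∩ B) == P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)


even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}
print(independent(even, at_most_four))  # True: 1/3 == 1/2 * 2/3
print(independent({1, 2}, {3, 4}))      # False: disjoint, 0 != 1/3 * 1/3
```

The second call is the disjointness case: two disjoint events of nonzero probability fail the product test.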

§ Are these two independent?



Intuitively it seems that these should be dependent, because knowing something about the first coin should tell us whether the coins match. But with $A$ the event that the two coins match and $B$ the event that the first coin is heads, $P(A|B)$ is the probability that the second coin is heads, which is $1/2$, and $P(A) = 1/2$.
But our intuition tells us that these should be different!

§ Be suspicious! Try general coins


Let the probability of heads be $p$ and of tails be $(1-p)$ for both coins.
$P(A|B) = p$, while $P(A) = p^2 + (1-p)^2$. These agree only when $p = 1/2$ or $p = 1$, so for a generic biased coin the events are dependent.
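
A sketch that computes both quantities by brute force for any bias. It assumes $A$ is "the two coins match", $B$ is "the first coin is heads", the two flips are independent, and $p > 0$ so that $P(B) \neq 0$; `match_vs_first_heads` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product


def match_vs_first_heads(p: Fraction):
    """Return (P(A), P(A | B)) for A = coins match, B = first coin is heads."""
    weight = {"H": p, "T": 1 - p}
    # Two independent flips: P((c1, c2)) = P(c1) * P(c2).
    P = {(c1, c2): weight[c1] * weight[c2] for c1, c2 in product("HT", repeat=2)}
    A = {o for o in P if o[0] == o[1]}
    B = {o for o in P if o[0] == "H"}

    def prob(event):
        return sum(P[o] for o in event)

    return prob(A), prob(A & B) / prob(B)


print(match_vs_first_heads(Fraction(1, 2)))  # fair coins: the two values agree
print(match_vs_first_heads(Fraction(2, 3)))  # biased coins: they differ
```

With $p = 2/3$ this gives $P(A) = 5/9$ but $P(A|B) = 2/3$, so the independence seen with fair coins was a coincidence of $p = 1/2$.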

§ Mutual independence


Events $A_1, A_2, \dots A_n$ are mutually independent if, for every $i$, no knowledge about any of the rest of the events tells us anything about the $i$th event.

§ Random variables


A random variable $R$ is a function from the sample space $S$ to $\mathbb R$. The fibers of $R$ partition the sample space into equivalence classes. Each of these is an event, since it's a subset of the sample space. Thus, $P(R = x) = P(R^{-1}(x)) = \sum_{w: R(w) = x} P(w)$.
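
Summing over fibers is easy to spell out in code. A sketch with a made-up example (two fair dice, $R$ = the sum of the faces); `pmf` is a hypothetical helper name.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical example: two fair dice, R = sum of the two faces.
P = {w: Fraction(1, 36) for w in product(range(1, 7), repeat=2)}


def R(w):
    """The random variable: a function from outcomes to numbers."""
    return w[0] + w[1]


def pmf(random_variable):
    """P(R = x) for every attained x, summing P(w) over the fiber R^{-1}(x)."""
    dist = defaultdict(Fraction)
    for w, p in P.items():
        dist[random_variable(w)] += p
    return dict(dist)


dist = pmf(R)
print(dist[7])  # prints "1/6"
```

The fiber over 7 has six outcomes, each of weight $1/36$, and the fibers together cover the whole sample space, so the values of `dist` sum to one.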

§ Independence of random variables


Random variables $R_1$ and $R_2$ are independent if:
$$ \forall x_1, x_2 \in \mathbb R, P(R_1 = x_1 | R_2 = x_2) = P(R_1 = x_1) $$

Slogan: No value of $R_2$ can influence any value of $R_1$.

§ Equivalent definition of independence:


$$ P(R_1 = x_1 \land R_2 = x_2) = P(R_1 = x_1) P(R_2 = x_2) $$
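
The product form can be checked exhaustively on a finite space. A sketch on two fair dice (a made-up example); `independent`, `first`, `second` and `total` are hypothetical names. Values that are never attained have probability zero on both sides, so only attained values are checked.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: two fair dice.
P = {w: Fraction(1, 36) for w in product(range(1, 7), repeat=2)}


def prob(pred):
    """Probability of the event described by a predicate on outcomes."""
    return sum(p for w, p in P.items() if pred(w))


def independent(r1, r2):
    """Check P(R1 = x1 and R2 = x2) == P(R1 = x1) * P(R2 = x2) for all attained values."""
    values1 = {r1(w) for w in P}
    values2 = {r2(w) for w in P}
    return all(
        prob(lambda w: r1(w) == x1 and r2(w) == x2)
        == prob(lambda w: r1(w) == x1) * prob(lambda w: r2(w) == x2)
        for x1 in values1
        for x2 in values2
    )


first = lambda w: w[0]
second = lambda w: w[1]
total = lambda w: w[0] + w[1]
print(independent(first, second))  # True
print(independent(first, total))   # False
```

The first face and the total are dependent: for instance, a total of 2 forces the first face to be 1.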

§ References