§ Linearity of expectation for sampling
import random

def val(c): return 1 + ord(c) - ord('a')

def process(addx, addy):
    s = 0
    ndraws = 10
    for _ in range(ndraws):
        x = random.choice("abcde")
        y = random.choice(x*5 + "abcde")  # y is biased towards repeating x
        if addx: s += val(x)
        if addy: s += val(y)
    return s
- Linearity of expectation says that, in expectation, process(True, True) equals process(True, False) + process(False, True).
- Intuitively, if we run process infinitely many times, then each execution of process(True, True) can be split into an execution of process(True, False) and an execution of process(False, True).
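The claim above can be checked empirically. Here is a minimal sketch (the seed and trial count are arbitrary choices of mine, made so the experiment is reproducible):

```python
import random

def val(c): return 1 + ord(c) - ord('a')

def process(addx, addy):
    s = 0
    for _ in range(10):
        x = random.choice("abcde")
        y = random.choice(x*5 + "abcde")  # y is biased towards x
        if addx: s += val(x)
        if addy: s += val(y)
    return s

random.seed(0)          # fixed seed so the experiment is reproducible
trials = 20000
avg = lambda f: sum(f() for _ in range(trials)) / trials
lhs = avg(lambda: process(True, True))
rhs = avg(lambda: process(True, False)) + avg(lambda: process(False, True))
print(lhs, rhs)         # both sample means land close to 60
```

Each draw contributes roughly 3 from val(x) and 3 from val(y), so over 10 draws both sides concentrate around 60, even though y is correlated with x.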
- The above picture assumes that we get the same results from running process(True, True) as we do when we run process(True, False) and process(False, True) in succession. Of course, this will almost never happen on any single run.
- However, since we are taking an average over many trials, we can imagine that for each run of process(True, True), we have corresponding runs of process(True, False) and process(False, True).
- The key thing to remember is that the random variable does not care about the process, only about the value that is spit out by the process.
- Thus we can read linearity of expectation as saying either: (1) simulate the full process in one go and accumulate the results as you go along (process(True, True), or E[a+b]), or (2) simulate the process in parts, and add up the accumulated results from the partial simulations (process(True, False) + process(False, True), or E[a] + E[b]).
- In both cases, we are allowed to simulate the process fully! The only thing that differs is when we accumulate the answers.
- This is in contrast to computing conditional probability, where the situation/process in which P(A|B) occurs is wildly different from that of P(A).
- Linearity of expectation asks us to run the same process, and merely tally the results differently.
- It tells us that, on average, the tallies line up, whether we tally in two separate phases or in a single phase, which makes intuitive sense!
§ Linearity of expectation is purity
Suppose we write:
x = random.choice("abcde")
y = random.choice("abcde")
s = val(x) + val(y)
- If we ask for the expected value of s, it's going to be:
E[s] = E[val(x) + val(y)]
= E[val(x)] + E[val(y)]
= 2 E[val(x)]
- The last equality follows because x and y are two copies of the same random variable random.choice("abcde"), and thus val(x) and val(y) have the same expected value.
- So, expectation 'purifies' random computations.
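As a sanity check, the identity E[s] = 2 E[val(x)] can be verified exactly by enumerating all 25 equally likely (x, y) pairs instead of sampling. A small sketch:

```python
from itertools import product

def val(c): return 1 + ord(c) - ord('a')

# All 25 (x, y) pairs are equally likely, since x and y are independent.
pairs = list(product("abcde", repeat=2))
E_s = sum(val(x) + val(y) for x, y in pairs) / len(pairs)
E_val_x = sum(val(x) for x in "abcde") / 5
print(E_s, 2 * E_val_x)  # prints 6.0 6.0
```

Here E[val(x)] = (1+2+3+4+5)/5 = 3, so E[s] = 6, matching the derivation.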
§ "Deriving" equivalence for two processes using purity
- First, write down process(True, False) + process(False, True) as:
def rhsI():
    sx = 0; sy = 0
    ndraws = 10
    for _ in range(ndraws):
        x = random.choice("abcde")
        y = random.choice(x*5 + "abcde")
        sx += val(x)
    for _ in range(ndraws):
        x = random.choice("abcde")
        y = random.choice(x*5 + "abcde")
        sy += val(y)
    return sx + sy
- Next, we use the purity of
random.choice
(within expectation) to fuse the two loops:
def rhsII():
    sx = 0; sy = 0
    ndraws = 10
    for _ in range(ndraws):
        x1 = random.choice("abcde")
        y1 = random.choice(x1*5 + "abcde")
        sx += val(x1)
        x2 = random.choice("abcde")
        y2 = random.choice(x2*5 + "abcde")
        sy += val(y2)
    return sx + sy
- Next, we use purity to set x1 = x2 and y1 = y2, since in expectation, their values are the same.
def rhsIII():
    sx = 0; sy = 0
    ndraws = 10
    for _ in range(ndraws):
        x1 = random.choice("abcde")
        y1 = random.choice(x1*5 + "abcde")
        sx += val(x1)
        x2 = x1
        y2 = y1
        sy += val(y2)
    return sx + sy
- Finally, we clean up the code to arrive at
process(True, True)
:
def rhsIV():
    sx = 0; sy = 0
    ndraws = 10
    for _ in range(ndraws):
        x1 = random.choice("abcde")
        y1 = random.choice(x1*5 + "abcde")
        sx += val(x1)
        sy += val(y1)
    return sx + sy
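The endpoints of this derivation can also be compared empirically: rhsI and rhsIV should have the same mean. A minimal sketch (the seed and trial count are arbitrary choices, made for reproducibility):

```python
import random

def val(c): return 1 + ord(c) - ord('a')

def rhsI():
    # Two separate loops: tally sx, then tally sy.
    sx = 0; sy = 0
    for _ in range(10):
        x = random.choice("abcde")
        y = random.choice(x*5 + "abcde")
        sx += val(x)
    for _ in range(10):
        x = random.choice("abcde")
        y = random.choice(x*5 + "abcde")
        sy += val(y)
    return sx + sy

def rhsIV():
    # One fused loop: tally sx and sy from the same draws.
    sx = 0; sy = 0
    for _ in range(10):
        x1 = random.choice("abcde")
        y1 = random.choice(x1*5 + "abcde")
        sx += val(x1)
        sy += val(y1)
    return sx + sy

random.seed(1)
trials = 20000
mean_I = sum(rhsI() for _ in range(trials)) / trials
mean_IV = sum(rhsIV() for _ in range(trials)) / trials
print(mean_I, mean_IV)  # both sample means land close to 60
```

The individual runs of rhsI and rhsIV are entirely different random executions; only their averages agree, which is exactly what the purity argument licenses.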
§ For N=1, the expected number of turns is 1.