A Universe of Sorts

§ Bounding chains: uniformly sample colorings

We wish to uniformly sample k colorings of a graph

G

with maximum degree

\Delta

. Hence, we require

k \geq \Delta + 1

. To perform this sampling, we use MCMC to sample from a markov chain whose states are

k

-colorings of

G

, whose stationary distribution is the uniform distribution over valid colorings.
The issue with MCMC techniques is that we never know when our chain has reached the stationary state. To ensure we receive uniformly distributed samples, we built a "bounding chain"

W

which has the following properties:

States of $W$ cover states of $X$ [ie, state space of $W$ are subsets of the state space of $X$ ].
$W$ 's convergence to a stationary state can be checked.
Upon convergence of $W$ , state of $W$ = state of $X$ .

We will in fact run the

W

chain, and prove that running the

W

chain is equivalent to running a 'shadow' of the

X

chain, and that stationary states of the

W

chain correspond to stationary sates of the

X

chain.
Let

X

be the chain whose states are valid

k

colorings of

G

. In one step of the chain

X

, we choose a vertex

v

uniformly at random; we then choose a color

c'_v

uniformly at random for

v

that makes it a proper coloring. The vertex

v

is changed to this new color

c'

. This is a symmetric proposal distribution, Hence this chain has the uniform distribution over

k

-colorings to be the stationary state.
Sampling exactly from this chain is hard: construct an initial state

X_0

amounts to finding some valid

k

coloring which in itself might be challenging. Worse, we do not know whether the chain

X

has reached a stationary state or not.

§ Bounding Chain

We construct a new chain

W

(the bounding chain of

X

), whose states are sets of colors for vertices in

G

. Formally, the states of

W

are functions

Vert(G) \rightarrow 2^C

where

Vert(G)

denotes the vertices of

G

;

C

the set of colors. The transition will be to pick a vertex

v

uniformly at random. Then, pick a new set of legal colors

C'_v

for

v

, such that:

It is guaranteed that if $X$ were transitioning on $v$ , the color $c'_v$ that would be picked by $X$ for $v$ is a member of $C'_v$ . [state is covered ]
The size of the set $C'_v$ attempts to be smaller than the current set of colorings $C_v$ . [convergence ]

We describe the transition function next. But first, we need an alternate lens on the transitions of

X

that is amenable to massaging.

§ Equivalent description of the transitions of $X$ :

Choosing a color uniformly at random from the set of valid colors for a vertex.
Choosing colors from $C$ without replacement until we get a color that is a valid color.

We claim that (1) and (2) have the same probability distribution.
Abstracting slightly, we state:

Probability of choosing an element $t \in T$ uniformly from $T \subseteq S$ . This has probability $1/|T|$ .
Probability of choosing a particular element $t \in T$ , by picking elements from $S$ without replacement until we get some element in $T$ .

(1) and (2) have the same probability distribution.

§ Proof by induction:

Process (2) in code is the following:

def process(S, T):
  assert(issubset(T, S)) # precondition
  s = np.random.choice(S) # uniform choice over S.
  if s in T: return s # |T|/|S| prob. to enter `if`.
  else:
    # (1 - |T|/|S|) to enter `else`
    Snext = S.remove(s); return process(Snext, T)

We claim that the probability that process(S, T) = t0 for a fixed t0 in T is

1/|T|

. We create a new function indicator to express this:

def indicator(t0, S, T):
  assert(t0 in T) # precondition
  assert(issubset(T, S)) # precondition

  return t0 == process(S, T)

Let's push in t0 == into the definiton of process after inling process. This gives us:

def indicator(t0, S, T):
  assert(t0 in T) # precondition
  assert(issubset(T, S)) # precondition

  s = np.random.choice(S)
  if s in T:
    # T/S prob. for x in T
    return t0 == s # 1/|T| prob. for t0 == x
  else:
    #  (1 - |T|/|S|) to reach else branch.
    Snext = S.remove(s); return process(Snext, T)

Now, we write down the recurrence for the probability that we are trying to compute:

P(t0, S, T)

is the probability that indicator(t0, S, T) returns True. Alternatively, it's the probability that process(S, T) returns t0.

P(t0, S, T) = prob. that process(S, T) = t0
P(t0, T, T) = T/|T| * 1/|T| = 1/|T| [base case]
P(t0, S, T) = 1/|S| + (1 - |T|/|S|) * P(t0, |S|-1, T) [induction]

We assume for induction that P(t0, |S|-1, T) = 1/|T|. On substitution into [induction], we get:

P(t0, S, T)
= 1/|S| + (1 - |T|/|S|) * P(t0, |S|-1, T) [induction]
= 1/|S| + (1 - |T|/|S|) * 1/|T|
= 1/|S| + (1/|T| - 1/|S|)s
= 1/|T|

Which is indeed the same probability as (1):

1. Choosing an element uniformly from a subset $T$ = 1/|T|.

§ Proof by program analysis, Version 1

def process(S, T):
  s = np.random.choice(S)
  if s in T: return s
  else return process(S.remove(s), T)

Notice that the last return is tail-call. This program can be rewritten as:

def process(S, T):
  while True:
    s = np.random.choice(S)
    if s in T: return s
    S = S.remove(s)

As previously, we create an indicator function and study its behaviour:

def indicator(t0, S, T):
  assert(t0 in T) # precondition
  assert(issubset(T, S)) # precondition

  while True:
    s = np.random.choice(S)
    if s in T: return t0 == s
    S = S.remove(s)

We know that this programs only returns a value from the line:

if s in T: return t0 == s

We now compute P(process(S, T) == t0). Whatever the return value of indicator, we can assume that it occured within the if condition. We can use this to compute:

P(indicator(t0, S, T) = True)
 = P(t0 == s | s in T) [program only returns in this case]
 = 1/|T|

§ Proof by program analysis, Version 2

Alternatively, we can also analyze this as we did in the first proof, using the rule:

def f():
  return if cond1 then cond2 else cond3

will have probability:

P(cond1) * P(cond2|cond1) + P(not cond1) * P(cond3|not cond1)

Using this, we can analyze indicator as:

P(indicator(t0, S, T) = True)
 = P(s in T) * P(t0 == s |s in T) +
    P(s not in T) * P(indicator(t0, S.remove(s), T) | s not in T)
 = |T|/|S| * 1/|T| +
    (|S|-|T|)/|S| * P(indicator(t0, S.remove(s), T))
 = 1/|S| + (|S|-|T|)/|S| * 1/|T| [by induction]
 = 1/|T|

§ Sampling using the above definition

Recall that

State(X) \equiv V \rightarrow C

State(W) \equiv V \rightarrow 2^C

. Let

x[~], w[~]

be the states of some run in the markov chains.
We start by having

x[0]

to be any valid k-coloring of

G

, and

w[0]

to be the state where all vertices have all possible colors;

w[0] \equiv \_ \mapsto C

. Clearly,

x[0] \in w[0]

.
By induction we assume that

x[n-1] \in w[n-1]

. We must now calculate a

w[n], x[n]

such that (1)

x[n] \in w[n]

, (2)

x[n]

's proposal is symmetric. (3)

w[n]

's proposal is symmetric.

§ Occluded set $O$

Define

O \subseteq C

(for occluded) be the set colors that might possibly be blocked for

v

from our view of w. Note that this is an over-approxmation : that is, there may be colors that are not blocked for v, which we believe to be blocked from w's point of view.

O \equiv \{ c \in C : (v, \alpha) \in E, c \in w[n-1](\alpha) \}

§ Allowed set $A$

Define

A \subset C

(for allowed) to be

C - O

. Note that

A

is an under-approxmation , since O was an over-approximation . That is:

Any color in A is definitely a legal color for v in x[n].
There are colors which are legal for v in x[n] that is not in A.

§ S: the sequence of states for transition

Now, we pick elements of

C

in sequence till we get an element of A. call this sequence

S

.
We will at worst pick

\Delta + 1

elements for

S

, since the max-degree of the graph is

\Delta

§ Transition

Let

i

be the first index in

S

where we get a color that is truly legal for

v

x[n]

. Note that such an index will always exist: We pick elements into

S

till we get an element in

A

, and elements of

A

are always legal. However, there can be elements which are not in

A

that are still legal for

v

x[n]

, since

A

is an under-approximation.

We assign $x[n](v) = i$ . So, x only cares about S[:i].
We assign $w[n](v) = A$ . So, W cares about the entire sequence.

By the lemma proven, we know that this process of picking colors C in a sequence till we get a color that is legal for

v

at index

i

is the same as picking uniformly at random from the set of colors that are legal for

v

§ An example

For example, we could have:

X | p:2 --- q:4
W | p:{1, 2} --- q:{4, 5}

we are sampling q:

O = {1, 2}
A = {3, 4, 5}
S = [2, 1, 3]

X | p:1 -- q:2
W | p:{1, 2} -- q:{1, 2, 3}

If we analyze S = [2, 1, 3] we notice that:

2: invalid for W(p:[1, 2]), invalid for X(p:2)
1: invalid for W, valid for X
3: valid for W, valid for X

So, what we are really sampling ix:

A prefix sequence SX = [1] (for Xs transition)
A leftover sequence SW = [2, 3] (for Ws transition)

To transition X, we can safely drop SW. However, to transition W correctly, we generate more elements in the sequence, till we hit a "safe" element.

§ An optimisation on picking colors: why we need a blocking set

Define the set

B \subseteq C

(for blocked) which governs which values

x[n]

surely cannot take from our view of

w[n-1]

. Note that

B

is an under-approximation . v might have more colors that are blocked than what w sees.

B \equiv \{ c \in C : (v, \alpha) \in E, w[n-1](\alpha) = \{c\} \}

Rather than sampling colors from C till we get an element of A, we can sample colors from C/B. We know that the colors in B can never be used by

X

, since the colors in B are those that we know are blocked for sure .
This is used in the theoretical analysis of the paper.

§ Termination

We terminate when

W

has "trapped"

X

. That is,

|w[n](v)| = 1

forall

v \in V

. In such a case, the states of

W

is equal to states of

X

. This is a coalescence (as it occurs in coupling from the past). From the coupling from the past theory, we know that we have reached a stationary state of

A

when this happens.

§ Pseudocode

-- | colors, vertices, edges
cs :: [C]; cs = ...
vs :: [V]; vs = ...
es :: [(V, V); es = ...


-- | definitely not allowed
blocked :: (V -> [C]) -> (V -> [C])
blocked v2cs v0 = concat [v2cs w | (v, w) <- es, v == v0, length (v2cs w) == 1]


-- | maybe not allowed
occluded :: (V -> [C]) -> (V -> [C])
occluded v2cs v0 = concat [v2cs w | (v, w) <- es, v == v0]

-- | definitely allowed from Ws point of view
allowed :: (V -> [C]) -> (V -> [C])
allowed v2cs = cs \\ occluded v2cs

-- | perturb the color function to another function
perturb :: (V -> [C]) -> Rand (V -> [C])
perturb v2cs = do $
  randv <- uniform_rand vs
  randc_for_v <- uniform_rand (allowed v2cs v)
  return $ \v -> if v == randv then randc_for_v else v2cs v


-- | check if we are done
terminated :: (V -> [C]) -> Bool
terminated v2cs = all [length (v2cs v) == 1 | v <- vs]

-- | generate chain
chain :: (V -> [C]) -> Rand [V -> [C]]
chain f = do
  if terminated f
  then [] else do
    f;' <- perturb f; fs <- chain f'; return (f:fs)

-- | return a sample
sample :: (V -> [C]) -> Rand (V -> [C])
sample = last . chain

§ References

Exact Sampling and Approximate Counting Techniques