A Universe of Sorts

§ Interpolants: Vibes

References: Applications of craig interpolants and model checking
We shall explain what an interpolant in two ways.

§ Implication based definition

Let $A(p, q)$ and $B(q, r)$ be propositional formulae (where $p, q$ are vectors of variables) such that $A \implies B$ .
Then, there exists a formula $C(q)$ such that $A(p, q) \implies C(q)$ and $C(q) \implies B(q, r)$ .
So, the "crux" of why $A$ implies $B$ lies in $C$ .
One can ask: why is this useful? and how does one compute such an object? and why does sucn an object even exist?

§ Existence proof for interpolant

Choose a particular $A(p, q)$ and $B(q, r)$ such that $A(p, q) \implies B(q, r)$ , We are trying to build an interpolant $C(q)$ .
For example, suppose $a, b, c$ are variables, and the assignment $M(a, b, c)$ be $a \mapsto 1$ , $b \mapsto 0$ , and $c \mapsto 1$ . ( $M$ for model).
Let $F(M(a, b c)))$ be a formula that is true when the values of variables in $a$ is equal to $M(a, b, c)$ .
Then $F(A(a, b, c)) \equiv a \land \lnot b \land c$ .
That is, it's the formula that's true when the values of $a, b, c$ match the assignment $M(a, b, c)$ .
Now, we will try to build an interpolant $C(q)$ .
TODO: this needs a bit more thought.

§ Contradiction of Conjunction Viewpoint

Let $A(p, q) \land B(q, r)$ be unsatisfiable.
Then, interpolation implies that there exists a formula $C(q)$ , such that $A(p, q) \implies C(q)$ and $C(q) \land B(q)$ is UNSAT.
This tells us that "unsat core" of $A$ and $B$ actually lives in $C$ .

§ Proof that implication interpolant implies conjunction unsat interpolant

We can phrase that $A(p, q) \land B(q, r)$ being UNSAT as saying that $\lnot (\lnot A \lor \lnot B)$ in UNSAT, or that $\lnot A \lor \lnot B$ as being TAUTO, or that $A \implies \lnot B$ .
We can apply interpolation to tell us that there is a formula such that $A(p, q) \implies C(q)$ such that $C(q) \implies \lnot B$ .
Reversing the logic, we can see that $C(q) \land B(q, r)$ must be UNSAT.
Roughly, we use the fact that $A \implies B$ is a tautology implies that whenever $A$ is true, $B$ must be true. This means that $A \implies \lnot B$ is UNSAT, which is the same as $\lnot A \lor \lnot B$ being unsat, which is the same as $\lnot (A \land B)$ being UNSAT.
I would appreciate a more direct way of seeing this fact!

§ Interpolant as telling us which clause is refuting

From the UNSAT perspective, we know that $(A(p, q) \land B(q, r))$ is UNSAT.
We can think of the interpolant $C(q)$ as telling us why the unsat happened.
Recall that $C(q)$ is such that (i) $A(p, q) \implies C(q)$ and (ii) $C(q) \land B(q)$ is UNSAT.
(i) implies $\lnot C(q) \implies \lnot A(p, q)$ .
(ii) implies $\lnot (C(q) \land B(q))$ is a tautology, or $\lnot C(q) \lor \lnot B(q)$ , or $C(q) \implies \lnot B(q)$ .
So, consider a model $M$ . We know that $M \lnot \models A \land B$ .
Let's now test what value $C(q)$ ought to have given the two above constraints.

§ Case 1: contradiction from clause in $A$

Suppose the contradiction arose from a clause in $A$ .
We must have $M \models A(p, q) \implies C(q)$ . This imposes no conditions on $A(p, q)$ since $A(p, q)$ is false, and thus the formula will be true regardless of $C(q)$ .
We must also have $M \lnot \models C(q) \land B(q, r)$ .
For this to happen regardless of $B(q, r)$ , we must have $C(q) = 0$ in $M$ .
See that this also agrees with $M \models \lnot C(q) \implies \lnot A(p, q)$ . So we can think of this as telling us to "label" $M$ with $C(q) = 0$ when the failure in $M \lnot \models A \land B$ arises from $A$ .

§ Case 2: contradiction from clause in $B$ .

Suppose the contradiction arose from a clause in $B$ .
We must have $M \models A(p, q) \implies C(q)$ . For this to be true regardless of the truth value of $\llbracket A(p, q) \rrbracket_M$ , we must choose $C(q) = 1$ .
We must also have $M \lnot \models C(q) \land B(q, r)$ . This will hold because $\llbracket B(q, r) \rrbracket_M = 0$ , regardless of $C(q)$ .
See that this also agrees with $M \models C(q) \implies \lnot B(p, q)$ . So we can think of this as telling us to "label" $M$ with $C(q) = 1$ when the failure in $M \lnot \models A \land B$ arises from $B$ .

§ Recipe for exponential sized $C$

Combining the two above, we can see that $C$ is really a labelling function, that labels for an UNSAT clause $A \land B$ , and for an arbitrary model $M$ , whether $M \lnot \models A \land B$ because of $A$ or because of $B$ .
If it's because of $A$ , then $\llbracket C \rrbracket = F$ , and otherwise it's because of $B$ , so $\llbracket C \rrbracket_M = T$ .
This gives us a recipe to write a $C$ down.
Create a clause of the form $\forall m \in M, (A(M) = 0 \implies F) \land (B(M) = 0 \implies T)$ .
This may not look propositional since we are quantifying over models.
However, the set of models is finite, so we can build a gigantic (exponential in the number of variables) disjunct over all models, and create a huge clause.
For example, if we want to know that we are in a model $[x_1 \mapsto T, x_2 \mapsto F]$ , we can literally write the expression $x_1 = T \land x_2 = F$ , or in more shorthand, $x_1 \land \lnot x_2$ .
This gives us an algorithm to compute an exponential sized interpolant, so we now know that interpolants always exist!

§ Linear time algorithm for computing interpolants

We now want an algorithm that gives us smaller interpolants.
Note that this algorithm is linear in the size of the refutation proof , which could be exponentially large!
The idea is to use the fact that the interpolant tells us which clause is refuting.
So, recall from above that we have that (i) $\lnot C(q) \implies \lnot A(p, q)$ , and that (ii) $C(q) \implies \lnot B(q, r)$ .
Next, see that in a resolution proof, we can walk a resolution proof "backwards" to figure out which clause gave rise to a contradiction.
We will build a $C(q)$ by performing induction on the resolution proof, maintaining (i) and (ii) as invariants of the clauses we have seen so far.

§ Using clause from $A$

If we are using a clause $A_i$ of $A$ , then the associated term for the interpolant will be $I(A_i) \equiv 0$ . The idea is that if the UNSAT for a particular model $M$ came from $A_i$ , then we must have that the interpolant $I(A_i)$ has value $0$ from the previous analysis. In this case, we know for sure that the UNSAT came from $A_i$ , and thus the interpolant is just $0$ .

§ Using clause from $B$

By the exact same argument, if we use clause $B_i$ of $B$ , then the associated term for the interpolant will be $I(B_i) \equiv 1$ .

§ Resolving, both from $A$ .

Suppose we resolve two clauses $A_i, A_j$ Now, if the contradiction came from either $A_i$ or $A_j$ , then this part of the resolution is the cause of UNSAT.
Now see that $I(Resolv(A_i, A_j))$ should be $0$ if the UNSAT came from $A$ , and $I(Resolve(A_i, A_j))$ should be $1$ if the UNSAT came from $B$ .
But if this part of the resolution is the cause of UNSAT, then we will have either $I(A_i) = 0$ or $I(A_j) = 0$ . In either of theses cases, the UNSAT came from $A$ .
So, we need $I(Resolv(A_i, A_j)) = 0$ . To achieve this, we set $I(Resolv(A_i, A_j)) = I(A_i) \land I(A_j)$ .
Or, in less terse terms, "if $I(A_i)$ or $I(A_j)$ is $0$ , then so is $I(Resolv(A_i, A_j))$ ". Convert that to a formula.

§ Resolving, both from $B$ .

Suppose we resolve two clauses $B_i, B_j$ Now, if the contradiction came from either $B_i$ or $B_j$ , then this part of the resolution is the cause of UNSAT.
Now see that $I(Resolv(B_i, B_j))$ should be $0$ if the UNSAT came from $A$ , and $I(Resolve(B_i, B_j))$ should be $1$ if the UNSAT came from $B$ .
But if this part of the resolution is the cause of UNSAT, then we will have either $I(B_i) = 0$ or $I(B_j) = 0$ . In either of theses cases, the UNSAT came from $B$ .
So, we need $I(Resolv(B_i, B_j)) = 1$ . To achieve this, we set $I(Resolv(A_i, A_j)) = I(B_i) \lor I(B_j)$ .
Or, in less terse terms, "if $I(B_i)$ or $I(B_j)$ is $1$ , then so is $I(Resolv(B_i, B_j))$ ". Convert that to a formula.

§ Resolving, one from $A$ , one from $B$

In very similar fashion, we build a resolvent of $I(Resolv(A_i, B_j))$ .
If we have resolved with $x \in A_i$ and $\lnot x \in B_i$ , and suppose that the resolvent lead to a UNSAT.
Now, we see that $x = F$ , then the contradiction came from $A_i$ , and so we must use the interpolant $I(A_i)$ in this case.
Similarly, if $x = T$ , then the contradiction came from $B_i$ , and we must use the interpolant $I(B_j)$ in this case.
Putting this together, we see that we want the formula $(\lnot x \implies I(A_i)) \land (x \implies I(B_i))$ .

§ Interpolant as "forward image"

Thinking once again from the UNSAT perspective, suppose that we are trying to prove that all executions are good.
Actually, I realized I don't understand this. will come back to it.

§ McMillan Viewpoint

Notes from McMillan
If $P$ and $Q$ are inconsistent, then there is a $P'$ is their shared language such that $P \models P'$ and $P' \land Q$ is inconsistent.
$P'$ is an abstraction of $P$ from the viewpoint of $Q$ . $P'$ summaries and translates into the shared language, why $P$ is inconsistent with $Q$ .
Let us have a transition system with $T(s, s')$ be the transition relation, $I$ b the initial condition, and $F$ the failure property. So the system is safe if no $T$ -reachable state from the initial state satisfies $F$ .
We will build a SAT formula where a satisfying assignment is a path to the failure.
Let $t_0(s_0) \equiv I(s_0)$ , $t_1(s_0, s_1) \equiv I(s_0) \land T(s_0, s_1)$ , and so on, with $t_{i+1} \equiv t_i \land T(s_i, s_{i+1})$ .
Suppose we show that $t_0 \land F(s_0)$ is UNSAT. This tells us that we do not fail in zero steps.
Suppose we show $\forall s_1, t_1(s_0, s_1) \land F(s_1)$ is UNSAT. This tells us that we do not fail at depth 1 for all possible reachable states.
Suppose we show $\forall s_1 s_2, t_2(s_0, s_1, s_2) \land F(s_2)$ is UNSAT. That is, we prove $\forall s_1 s_2, (I(s_0) \land T(s_0, s_1)) \land (T(s_1, s_2) \land F(s_2))$ is UNSAT.
If we think of the first part as $P \equiv I(s_0) \land T(s_0, s_1)$ , and the second part as $Q \equiv T(s_1, s_2) \land F(s_2)$ , then the common variables of $P$ and $Q$ is exactly $s_1$ , which represents the state $s_1$ . In general, it will be the propositional variables used to represent $s_1$ .
Since $P \land Q$ is UNSAT, we know that there is no counter-example of length $k$ . So, we compute interpolant $P'(s_2)$ .
Since $P'$ is an interpolant, we know that $P \models P'$ , which is to say, $I \land T(s_0, s_1) \models P'(s_1)$ . This means that $P'$ is true of all states that are reachable in one step!
In this sense, $P'$ is an overapproximation of the forward image of $I$ (of 1 step), which is $I \land T(s_0, s_1)$ .
Furthermore, we know that $P' \land Q$ is UNSAT, which means that $\forall s_1 s_2, P'(s_1) \land T(s_1, s_2) \land F(s_2)$ is UNSAT, we know that no state statisfying $P'$ can reach a bad state in $(k-1)$ steps.
This over-approximate image can be iterated to compute an over-approximsation of the reachable states.
McMillan Applications of Craig Interpolation in Model Checking

§ Interpolants: Vibes

§ Implication based definition

§ Existence proof for interpolant

§ Contradiction of Conjunction Viewpoint

§ Proof that implication interpolant implies conjunction unsat interpolant

§ Interpolant as telling us which clause is refuting

§ Case 1: contradiction from clause in AAA

§ Case 2: contradiction from clause in BBB.

§ Recipe for exponential sized CCC

§ Linear time algorithm for computing interpolants

§ Using clause from AAA

§ Using clause from BBB

§ Resolving, both from AAA.

§ Resolving, both from BBB.

§ Resolving, one from AAA, one from BBB

§ Interpolant as "forward image"

§ McMillan Viewpoint

§ Case 1: contradiction from clause in $A$

§ Case 2: contradiction from clause in $B$ .

§ Recipe for exponential sized $C$

§ Using clause from $A$

§ Using clause from $B$

§ Resolving, both from $A$ .

§ Resolving, both from $B$ .

§ Resolving, one from $A$ , one from $B$