## § Interpolants: Vibes

• We shall explain what an interpolant in two ways.

#### § Implication based definition

• Let $A(p, q)$ and $B(q, r)$ be propositional formulae (where $p, q$ are vectors of variables) such that $A \implies B$.
• Then, there exists a formula $C(q)$ such that $A(p, q) \implies C(q)$ and $C(q) \implies B(q, r)$.
• So, the "crux" of why $A$ implies $B$ lies in $C$.
• One can ask: why is this useful? and how does one compute such an object? and why does sucn an object even exist?

#### § Existence proof for interpolant

• Choose a particular $A(p, q)$ and $B(q, r)$ such that $A(p, q) \implies B(q, r)$, We are trying to build an interpolant $C(q)$.
• For example, suppose $a, b, c$ are variables, and the assignment $M(a, b, c)$ be $a \mapsto 1$, $b \mapsto 0$, and $c \mapsto 1$. ( $M$ for model).
• Let $F(M(a, b c)))$ be a formula that is true when the values of variables in $a$ is equal to $M(a, b, c)$.
• Then $F(A(a, b, c)) \equiv a \land \lnot b \land c$.
• That is, it's the formula that's true when the values of $a, b, c$ match the assignment $M(a, b, c)$.
• Now, we will try to build an interpolant $C(q)$.
• TODO: this needs a bit more thought.

#### § Contradiction of Conjunction Viewpoint

• Let $A(p, q) \land B(q, r)$ be unsatisfiable.
• Then, interpolation implies that there exists a formula $C(q)$, such that $A(p, q) \implies C(q)$ and $C(q) \land B(q)$ is UNSAT.
• This tells us that "unsat core" of $A$ and $B$ actually lives in $C$.

#### § Proof that implication interpolant implies conjunction unsat interpolant

• We can phrase that $A(p, q) \land B(q, r)$ being UNSAT as saying that $\lnot (\lnot A \lor \lnot B)$ in UNSAT, or that $\lnot A \lor \lnot B$ as being TAUTO, or that $A \implies \lnot B$.
• We can apply interpolation to tell us that there is a formula such that $A(p, q) \implies C(q)$ such that $C(q) \implies \lnot B$.
• Reversing the logic, we can see that $C(q) \land B(q, r)$ must be UNSAT.
• Roughly, we use the fact that $A \implies B$ is a tautology implies that whenever $A$ is true, $B$ must be true. This means that $A \implies \lnot B$is UNSAT, which is the same as $\lnot A \lor \lnot B$ being unsat, which is the same as $\lnot (A \land B)$ being UNSAT.
• I would appreciate a more direct way of seeing this fact!

#### § Interpolant as telling us which clause is refuting

• From the UNSAT perspective, we know that $(A(p, q) \land B(q, r))$ is UNSAT.
• We can think of the interpolant $C(q)$ as telling us why the unsat happened.
• Recall that $C(q)$ is such that (i) $A(p, q) \implies C(q)$ and (ii) $C(q) \land B(q)$ is UNSAT.
• (i) implies $\lnot C(q) \implies \lnot A(p, q)$.
• (ii) implies $\lnot (C(q) \land B(q))$ is a tautology, or $\lnot C(q) \lor \lnot B(q)$, or $C(q) \implies \lnot B(q)$.
• So, consider a model $M$. We know that .
• Let's now test what value $C(q)$ ought to have given the two above constraints.

#### § Case 1: contradiction from clause in $A$

• Suppose the contradiction arose from a clause in $A$.
• We must have $M \models A(p, q) \implies C(q)$. This imposes no conditions on $A(p, q)$ since $A(p, q)$ is false, and thus the formula will be true regardless of $C(q)$.
• We must also have $M \lnot \models C(q) \land B(q, r)$.
• For this to happen regardless of $B(q, r)$, we must have $C(q) = 0$ in $M$.
• See that this also agrees with $M \models \lnot C(q) \implies \lnot A(p, q)$. So we can think of this as telling us to "label" $M$ with $C(q) = 0$ when the failure in $M \lnot \models A \land B$arises from $A$.

#### § Case 2: contradiction from clause in $B$.

• Suppose the contradiction arose from a clause in $B$.
• We must have $M \models A(p, q) \implies C(q)$. For this to be true regardless of the truth value of $\llbracket A(p, q) \rrbracket_M$, we must choose $C(q) = 1$.
• We must also have $M \lnot \models C(q) \land B(q, r)$. This will hold because $\llbracket B(q, r) \rrbracket_M = 0$, regardless of $C(q)$.
• See that this also agrees with $M \models C(q) \implies \lnot B(p, q)$. So we can think of this as telling us to "label" $M$ with $C(q) = 1$ when the failure in $M \lnot \models A \land B$arises from $B$.

#### § Recipe for exponential sized $C$

• Combining the two above, we can see that $C$ is really a labelling function, that labels for an UNSAT clause $A \land B$, and for an arbitrary model $M$, whether $M \lnot \models A \land B$ because of $A$ or because of $B$.
• If it's because of $A$, then $\llbracket C \rrbracket = F$, and otherwise it's because of $B$, so $\llbracket C \rrbracket_M = T$.
• This gives us a recipe to write a $C$ down.
• Create a clause of the form $\forall m \in M, (A(M) = 0 \implies F) \land (B(M) = 0 \implies T)$.
• This may not look propositional since we are quantifying over models.
• However, the set of models is finite, so we can build a gigantic (exponential in the number of variables) disjunct over all models, and create a huge clause.
• For example, if we want to know that we are in a model $[x_1 \mapsto T, x_2 \mapsto F]$, we can literally write the expression $x_1 = T \land x_2 = F$, or in more shorthand, $x_1 \land \lnot x_2$.
• This gives us an algorithm to compute an exponential sized interpolant, so we now know that interpolants always exist!

#### § Linear time algorithm for computing interpolants

• We now want an algorithm that gives us smaller interpolants.
• Note that this algorithm is linear in the size of the refutation proof , which could be exponentially large!
• The idea is to use the fact that the interpolant tells us which clause is refuting.
• So, recall from above that we have that (i) $\lnot C(q) \implies \lnot A(p, q)$, and that (ii) $C(q) \implies \lnot B(q, r)$.
• Next, see that in a resolution proof, we can walk a resolution proof "backwards" to figure out which clause gave rise to a contradiction.
• We will build a $C(q)$ by performing induction on the resolution proof, maintaining (i) and (ii) as invariants of the clauses we have seen so far.

#### § Using clause from $A$

• If we are using a clause $A_i$ of $A$, then the associated term for the interpolant will be $I(A_i) \equiv 0$. The idea is that if the UNSAT for a particular model $M$ came from $A_i$, then we must have that the interpolant $I(A_i)$ has value $0$ from the previous analysis. In this case, we know for sure that the UNSAT came from $A_i$, and thus the interpolant is just $0$.

#### § Using clause from $B$

• By the exact same argument, if we use clause $B_i$ of $B$, then the associated term for the interpolant will be $I(B_i) \equiv 1$.

#### § Resolving, both from $A$.

• Suppose we resolve two clauses $A_i, A_j$ Now, if the contradiction came from either $A_i$ or $A_j$, then this part of the resolution is the cause of UNSAT.
• Now see that $I(Resolv(A_i, A_j))$ should be $0$ if the UNSAT came from $A$, and $I(Resolve(A_i, A_j))$ should be $1$if the UNSAT came from $B$.
• But if this part of the resolution is the cause of UNSAT, then we will have either $I(A_i) = 0$ or $I(A_j) = 0$. In either of theses cases, the UNSAT came from $A$.
• So, we need $I(Resolv(A_i, A_j)) = 0$. To achieve this, we set $I(Resolv(A_i, A_j)) = I(A_i) \land I(A_j)$.
• Or, in less terse terms, "if $I(A_i)$ or $I(A_j)$ is $0$, then so is $I(Resolv(A_i, A_j))$". Convert that to a formula.

#### § Resolving, both from $B$.

• Suppose we resolve two clauses $B_i, B_j$ Now, if the contradiction came from either $B_i$ or $B_j$, then this part of the resolution is the cause of UNSAT.
• Now see that $I(Resolv(B_i, B_j))$ should be $0$ if the UNSAT came from $A$, and $I(Resolve(B_i, B_j))$ should be $1$if the UNSAT came from $B$.
• But if this part of the resolution is the cause of UNSAT, then we will have either $I(B_i) = 0$ or $I(B_j) = 0$. In either of theses cases, the UNSAT came from $B$.
• So, we need $I(Resolv(B_i, B_j)) = 1$. To achieve this, we set $I(Resolv(A_i, A_j)) = I(B_i) \lor I(B_j)$.
• Or, in less terse terms, "if $I(B_i)$ or $I(B_j)$ is $1$, then so is $I(Resolv(B_i, B_j))$". Convert that to a formula.

#### § Resolving, one from $A$, one from $B$

• In very similar fashion, we build a resolvent of $I(Resolv(A_i, B_j))$.
• If we have resolved with $x \in A_i$ and $\lnot x \in B_i$, and suppose that the resolvent lead to a UNSAT.
• Now, we see that $x = F$, then the contradiction came from $A_i$, and so we must use the interpolant $I(A_i)$ in this case.
• Similarly, if $x = T$, then the contradiction came from $B_i$, and we must use the interpolant $I(B_j)$ in this case.
• Putting this together, we see that we want the formula $(\lnot x \implies I(A_i)) \land (x \implies I(B_i))$.

#### § Interpolant as "forward image"

• Thinking once again from the UNSAT perspective, suppose that we are trying to prove that all executions are good.
• Actually, I realized I don't understand this. will come back to it.