§ Computing equivalent gate sets using grobner bases
Here's a fun little problem, whose only solution I know involves a fair
bit of math and computer algebra:
We are given the grammar for a language L
:
E = T +_mod8 E | T -_mod8 E | T
T = V | V ^ V | V ^ V ^ V
V = 'a1' | 'a2' | ...
where +_mod8
is addition modulo 8, -_mod8
is subtraction modulo 8,
and ^
is XOR.
This language is equipped with the obvious
evaluation rules, corresponding to those of arithmetic. We are guaranteed
that during evaluation, the variables a_i
will only have values 0
and 1
.
Since we have addition, we can perform multiplication by a constant
by repeated addition. So we can perform 3*a
as a+a+a
.
We are then given the input expression (a0 ^ a1 ^ a2 ^ a3)
. We wish
to find an equivalent expression in terms of the above language L
.
We think of E
as some set of logic gates we are allowed to use, and we are
trying to express the operation (a0 ^ a1 ^ a2 ^ a3)
in terms of these gates.
The first idea that I thought was that of employing a grobner basis,
since they essentially embody rewrite rules modulo polynomial equalities, which
is precisely our setting here.
In this blog post, I'm going to describe what a grobner basis is and why it's
natural to reach for them to solve this problem, the code, and the eventual
solution.
As a spolier, the solution is:
a^b^c^d =
-a - b + c + 3*d - 3*axorb - axorc
+ axord - bxorc + bxord + 3*cxord
- 3*axorbxorc - axorbxord
+ axorcxord + bxorcxord
Clearly, this contains only additions/subtractions and multiplications by
a constant.
If there's some principled way to derive this (beyond throwing symbolic
algebra machinery), I'd really love to know ---
Please raise an issue with the explanation!
§ What the hell is Grobner Basis?
The nutshell is that a grobner basis is a way to construct rewrite rules which
also understand arithmetic (I learnt this viewpoint from the book "Term
rewriting and all that". Fantastic book in general). Expanding on the
nutshell, assume we have a term rewriting system:
A -> -1*B -- (1)
C -> B^2 -- (2)
over an alphabet {A, B, C}
.
Now, given the string C + AB
, we wish to find out if it can be rewritten to
0
or not. Let's try to substitute and see what happens:
C + AB -2-> B^2 + AB -1-> B^2 + (-1*B)B
At this point, we're stuck! we don't have rewrite rules to allow us to
rewrite (-1*B)B
into -B^2
. Indeed, creating such a list would be
infinitely long. But if we are willing to accept that we somehow have
the rewrite rules that correspond to polynomial arithmetic, where we view
A, B, C
as variables, then we can rewrite the above string to 0:
B^2 + (-1*B)B -> B^2 - B^2 -> 0
A Grobner basis is the algorithmic / mathematical machine that allows us
to perform this kind of substitution.
In this example, this might appear stupid: what is so special? We simply
substituted variables and arrived at 0
by using arithmetic. What's
so complicated about that? To understand why this is not always so easy,
let's consider a pathological, specially constructed example
§ A complicated example that shatters dreams
Here's the pathological example:
A -> 1 -- (1)
AB -> -B^2 -- (2)
And we consider the string S = AB + B^2
. If we blindly apply the
first rule, we arrive at:
S = AB + B^2 -1-> 1B + B^2 = B + B^2 (STUCK)
However, if we apply (2)
and then (1)
:
S = AB + B^2 -2-> -B^2 + B^2 -> 0
This tells us that we can't just apply the rewrite rules willy-nilly .
It's sensitive to the order of the rewrites! That is, the rewrite system
is not confluent ).
The grobner basis is a function from rewrite systems to rewrite systems.
When given a rewrite system R
, it produces a new rewrite system R'
that is confluent . So, we can apply the rewrite rules of R'
in any order,
and we guaranteed that we will only get a 0 from R'
if and only if
we could have gotten a 0
from R
for all strings.
We can then go on to phrase this whole rewriting setup in the language of
ideals from ring theory, and that is the language in which it is most
often described. I've gone into more depth on that perspective here:
"What is a grobner basis? polynomial factorization as rewrite systems" .
Now that we have a handle on what a grobner basis is, let's go on to solve
the original problem:
§ An explanation through a slightly simpler problem
I'll first demonstrate the idea of how to solve the original problem
by solving a slightly simpler problem:
Rewrite a^b^c
in terms of a^b
, b^c
, c^a
and the same +_mod8
instruction
set as the original problem. The only difference this time
is that we do not have T -> V ^ V ^ V
.
The idea is to construct the polynomial ring over Z/8Z
(integers modulo 8) with
variables as a, b, c, axorb, bxorc, axorc
. Now, we know that a^b = a + b - 2ab
. So,
we setup rewrite rules such that a + b - 2ab -> axorb
, b + c - 2bc -> bxorb
,
c + a - 2ca -> cxora
.
We construct the polynomial f(a, b, c) = a^b^c
, which
has been written in terms of addition and multiplication, defined
as f_orig(a, b, c) = 4*a*b*c - 2*a*b - 2*a*c - 2*b*c + a + b + c
. We then
rewrite f_orig
with respect to our rewrite rules. Hopefully, the rewrite
rules should give us a clean expression in terms of one variable and
two-variable xor
s. There is the danger that we may have some term
such as a * bxorc
, and we do get such a term ( 2*b*axorc
) in this case,
but it does not appear in the original problem.
R = IntegerModRing(8)['a, b, c, axorb, bxorc, axorc']
(a, b, c, axorb, bxorc, axorc) = R.gens()
def xor2(x, y): return x + y - 2*x*y
def xor3(x, y, z): return xor2(x, xor2(y, z))
I = ideal((axorb - xor2(a, b), bxorc - xor2(b, c), axorc - xor2(a, c), a*a-a, b*b-b, c*c-c))
f_orig = xor3(a, b, c)
IG = I.groebner_basis()
f_reduced = f_orig.reduce(IG)
print("value of a^b^c:\n\t%s\n\treduced: %s" % (f_orig, f_reduced))
def evalxor2(f):
for (i, j, k) in [(i, j, k) for i in [0, 1] for j in [0, 1] for k in [0, 1]]:
ref = i^^j^^k
eval = f.substitute(a=i, b=j, c=k, axorb=i^^j, bxorc=j^^k, axorc=i^^k)
print("%s^%s^%s: ref(%s) =?= f(%s): %s" %
(i, j, k, ref, eval, ref == eval))
print("evaulating original f for sanity check:")
evalxor2(f_orig)
print("evaulating reduced f:")
evalxor2(f_reduced)
Running the code gives us the reduced polynomial -2*b*axorc + b + axorc
which unfortunately contains a term that is b * axorc
. So, this approach
does not work, and I was informed by my friend that she is unaware
of a solution to this problem (writing a^b^c
in terms of smaller xors and
sums).
The full code output is:
value of a^b^c:
4*a*b*c - 2*a*b - 2*a*c - 2*b*c + a + b + c
reduced: -2*b*axorc + b + axorc
evaulating original f for sanity check:
0^0^0: ref(0) =?= f(0): True
0^0^1: ref(1) =?= f(1): True
0^1^0: ref(1) =?= f(1): True
0^1^1: ref(0) =?= f(0): True
1^0^0: ref(1) =?= f(1): True
1^0^1: ref(0) =?= f(0): True
1^1^0: ref(0) =?= f(0): True
1^1^1: ref(1) =?= f(1): True
evaulating reduced f:
0^0^0: ref(0) =?= f(0): True
0^0^1: ref(1) =?= f(1): True
0^1^0: ref(1) =?= f(1): True
0^1^1: ref(0) =?= f(0): True
1^0^0: ref(1) =?= f(1): True
1^0^1: ref(0) =?= f(0): True
1^1^0: ref(0) =?= f(0): True
1^1^1: ref(1) =?= f(1): True
That is, both the original polynomial and the reduced polynomial match
the expected results. But the reduced polynomial is not in our language L
,
since it has a term that is a product of b
with axorc
.
§ Tacking the original problem.
We try the exact same approach to the original problem of expressing
a ^ b ^ c ^ d
. We find that the reduced polynomial is
-a - b + c + 3*d - 3*axorb - axorc
+ axord - bxorc + bxord + 3*cxord
- 3*axorbxorc - axorbxord
+ axorcxord + bxorcxord
which happily has no products between terms! It also passes our sanity
check, so we've now found the answer.
The full output is:
value of a^b^c^d:
4*a*b*c + 4*a*b*d + 4*a*c*d + 4*b*c*d
- 2*a*b - 2*a*c - 2*b*c - 2*a*d
- 2*b*d - 2*c*d + a + b + c + d
reduced: -a - b + c + 3*d - 3*axorb
- axorc + axord - bxorc + bxord
+ 3*cxord - 3*axorbxorc
- axorbxord + axorcxord + bxorcxord
evaluating original a^b^c^d
0^0^0^0: ref(0) =?= f(0): True
0^0^0^1: ref(1) =?= f(1): True
0^0^1^0: ref(1) =?= f(1): True
0^0^1^1: ref(0) =?= f(0): True
0^1^0^0: ref(1) =?= f(1): True
0^1^0^1: ref(0) =?= f(0): True
0^1^1^0: ref(0) =?= f(0): True
0^1^1^1: ref(1) =?= f(1): True
1^0^0^0: ref(1) =?= f(1): True
1^0^0^1: ref(0) =?= f(0): True
1^0^1^0: ref(0) =?= f(0): True
1^0^1^1: ref(1) =?= f(1): True
1^1^0^0: ref(0) =?= f(0): True
1^1^0^1: ref(1) =?= f(1): True
1^1^1^0: ref(1) =?= f(1): True
1^1^1^1: ref(0) =?= f(0): True
evaluating reduced a^b^c^d
0^0^0^0: ref(0) =?= f(0): True
0^0^0^1: ref(1) =?= f(1): True
0^0^1^0: ref(1) =?= f(1): True
0^0^1^1: ref(0) =?= f(0): True
0^1^0^0: ref(1) =?= f(1): True
0^1^0^1: ref(0) =?= f(0): True
0^1^1^0: ref(0) =?= f(0): True
0^1^1^1: ref(1) =?= f(1): True
1^0^0^0: ref(1) =?= f(1): True
1^0^0^1: ref(0) =?= f(0): True
1^0^1^0: ref(0) =?= f(0): True
1^0^1^1: ref(1) =?= f(1): True
1^1^0^0: ref(0) =?= f(0): True
1^1^0^1: ref(1) =?= f(1): True
1^1^1^0: ref(1) =?= f(1): True
1^1^1^1: ref(0) =?= f(0): True
§ code for a^b^c^d
reduction:
def xor3(x, y, z): return xor2(x, xor2(y, z))
R = IntegerModRing(8)['a, b, c, d, axorb, axorc, axord, bxorc, \
bxord, cxord, axorbxorc, axorbxord, axorcxord, bxorcxord']
(a, b, c, d, axorb, axorc, axord, bxorc, bxord, cxord, axorbxorc, \
axorbxord, axorcxord, bxorcxord) = R.gens()
I = ideal((axorb - xor2(a, b),
axorc - xor2(a, c),
axord - xor2(a, d),
bxorc - xor2(b, c),
bxord - xor2(b, d),
cxord - xor2(c, d),
axorbxorc - xor3(a, b, c),
axorbxord - xor3(a, b, d),
axorcxord - xor3(a, c, d),
bxorcxord - xor3(b, c, d),
a*a-a,
b*b-b,
c*c-c,
d*d-d
))
IG = I.groebner_basis()
f_orig = (xor2(a, xor2(b, xor2(c, d))))
f_reduced = f_orig.reduce(IG)
print("value of a^b^c^d:\n\t%s\n\treduced: %s" % (f_orig, f_reduced))
def evalxor3(f):
for (i, j, k, l) in [(i, j, k, l) for i in [0, 1] \
for j in [0, 1] \
for k in [0, 1] \
for l in [0, 1]]:
ref = i^^j^^k^^l
eval = f.substitute(a=i, b=j, c=k, d=l,
axorb=i^^j, axorc=i^^k,
axord=i^^l, bxorc=j^^k,
bxord=j^^l, cxord=k^^l,
axorbxorc=i^^j^^k, axorbxord=i^^j^^l,
axorcxord=i^^k^^l, bxorcxord=j^^k^^l)
print("%s^%s^%s^%s: ref(%s) =?= f(%s): %s" %
(i, j, k, l, ref, eval, ref == eval))
print("evaluating original a^b^c^d")
evalxor3(f_orig)
print("evaluating reduced a^b^c^d")
evalxor3(f_reduced)
§ Closing thoughts
This was a really fun exercise: Around a hundred lines of code illuminates
the use of machinery such as grobner basis for solving real-world problems!
I really enjoyed hacking this up and getting nerd sniped.