L
:
E = T +_mod8 E | T -_mod8 E | T
T = V | V ^ V | V ^ V ^ V
V = 'a1' | 'a2' | ...
where +_mod8
is addition modulo 8, -_mod8
is subtraction modulo 8,
and ^
is XOR.
This language is equipped with the obvious
evaluation rules, corresponding to those of arithmetic. We are guaranteed
that during evaluation, the variables a_i
will only have values 0
and 1
.
Since we have addition, we can perform multiplication by a constant
by repeated addition. So we can perform 3*a
as a+a+a
.
We are then given the input expression (a0 ^ a1 ^ a2 ^ a3)
. We wish
to find an equivalent expression in terms of the above language L
.
We think of E
as some set of logic gates we are allowed to use, and we are
trying to express the operation (a0 ^ a1 ^ a2 ^ a3)
in terms of these gates.
The first idea that I thought was that of employing a grobner basis,
since they essentially embody rewrite rules modulo polynomial equalities, which
is precisely our setting here.
In this blog post, I'm going to describe what a grobner basis is and why it's
natural to reach for them to solve this problem, the code, and the eventual
solution.
As a spolier, the solution is:
a^b^c^d =
-a - b + c + 3*d - 3*axorb - axorc
+ axord - bxorc + bxord + 3*cxord
- 3*axorbxorc - axorbxord
+ axorcxord + bxorcxord
Clearly, this contains only additions/subtractions and multiplications by
a constant.
If there's some principled way to derive this (beyond throwing symbolic
algebra machinery), I'd really love to know ---
Please raise an issue with the explanation!
A -> -1*B -- (1)
C -> B^2 -- (2)
over an alphabet {A, B, C}
.
Now, given the string C + AB
, we wish to find out if it can be rewritten to
0
or not. Let's try to substitute and see what happens:
C + AB -2-> B^2 + AB -1-> B^2 + (-1*B)B
At this point, we're stuck! we don't have rewrite rules to allow us to
rewrite (-1*B)B
into -B^2
. Indeed, creating such a list would be
infinitely long. But if we are willing to accept that we somehow have
the rewrite rules that correspond to polynomial arithmetic, where we view
A, B, C
as variables, then we can rewrite the above string to 0:
B^2 + (-1*B)B -> B^2 - B^2 -> 0
A Grobner basis is the algorithmic / mathematical machine that allows us
to perform this kind of substitution.
In this example, this might appear stupid: what is so special? We simply
substituted variables and arrived at 0
by using arithmetic. What's
so complicated about that? To understand why this is not always so easy,
let's consider a pathological, specially constructed example
A -> 1 -- (1)
AB -> -B^2 -- (2)
And we consider the string S = AB + B^2
. If we blindly apply the
first rule, we arrive at:
S = AB + B^2 -1-> 1B + B^2 = B + B^2 (STUCK)
However, if we apply (2)
and then (1)
:
S = AB + B^2 -2-> -B^2 + B^2 -> 0
This tells us that we can't just apply the rewrite rules willy-nilly .
It's sensitive to the order of the rewrites! That is, the rewrite system
is not confluent ).
The grobner basis is a function from rewrite systems to rewrite systems.
When given a rewrite system R
, it produces a new rewrite system R'
that is confluent . So, we can apply the rewrite rules of R'
in any order,
and we guaranteed that we will only get a 0 from R'
if and only if
we could have gotten a 0
from R
for all strings.
We can then go on to phrase this whole rewriting setup in the language of
ideals from ring theory, and that is the language in which it is most
often described. I've gone into more depth on that perspective here:
"What is a grobner basis? polynomial factorization as rewrite systems" .
Now that we have a handle on what a grobner basis is, let's go on to solve
the original problem:
RewriteThe idea is to construct the polynomial ring overa^b^c
in terms ofa^b
,b^c
,c^a
and the same+_mod8
instruction set as the original problem. The only difference this time is that we do not haveT -> V ^ V ^ V
.
Z/8Z
(integers modulo 8) with
variables as a, b, c, axorb, bxorc, axorc
. Now, we know that a^b = a + b - 2ab
. So,
we setup rewrite rules such that a + b - 2ab -> axorb
, b + c - 2bc -> bxorb
,
c + a - 2ca -> cxora
.
We construct the polynomial f(a, b, c) = a^b^c
, which
has been written in terms of addition and multiplication, defined
as f_orig(a, b, c) = 4*a*b*c - 2*a*b - 2*a*c - 2*b*c + a + b + c
. We then
rewrite f_orig
with respect to our rewrite rules. Hopefully, the rewrite
rules should give us a clean expression in terms of one variable and
two-variable xor
s. There is the danger that we may have some term
such as a * bxorc
, and we do get such a term ( 2*b*axorc
) in this case,
but it does not appear in the original problem.
# Create ring with variables a, b, c, axorb, bxorc, axorc
R = IntegerModRing(8)['a, b, c, axorb, bxorc, axorc']
(a, b, c, axorb, bxorc, axorc) = R.gens()
# xor of 2 numbers as a polynomial
def xor2(x, y): return x + y - 2*x*y
# xor of 3 numbers as a polynomial
def xor3(x, y, z): return xor2(x, xor2(y, z))
# define the ideal which contains relations:
# xor2(a, b) -> axorb, xor2(b, c) -> bxorc, xor2(a, c) -> axorc
# we also add the relation (a^2 - a = 0 => a = 0 or a = 1)
# since we know that our variables are only {0, 1}
I = ideal((axorb - xor2(a, b), bxorc - xor2(b, c), axorc - xor2(a, c), a*a-a, b*b-b, c*c-c))
# the polynomial representing a^b^c we wish to reduce
f_orig = xor3(a, b, c)
# we take the groebner basis of the ring to reduce the polynomial f.
IG = I.groebner_basis()
# we reduce a^b^c with respect to the groebner basis.
f_reduced = f_orig.reduce(IG)
print("value of a^b^c:\n\t%s\n\treduced: %s" % (f_orig, f_reduced))
# Code to evaluate the function `f` on all inputs to check correctness
def evalxor2(f):
for (i, j, k) in [(i, j, k) for i in [0, 1] for j in [0, 1] for k in [0, 1]]:
ref = i^^j^^k
eval = f.substitute(a=i, b=j, c=k, axorb=i^^j, bxorc=j^^k, axorc=i^^k)
print("%s^%s^%s: ref(%s) =?= f(%s): %s" %
(i, j, k, ref, eval, ref == eval))
# check original formulation is correct
print("evaulating original f for sanity check:")
evalxor2(f_orig)
# Check reduced formulation is correct
print("evaulating reduced f:")
evalxor2(f_reduced)
Running the code gives us the reduced polynomial -2*b*axorc + b + axorc
which unfortunately contains a term that is b * axorc
. So, this approach
does not work, and I was informed by my friend that she is unaware
of a solution to this problem (writing a^b^c
in terms of smaller xors and
sums).
The full code output is:
value of a^b^c:
4*a*b*c - 2*a*b - 2*a*c - 2*b*c + a + b + c
reduced: -2*b*axorc + b + axorc
evaulating original f for sanity check:
0^0^0: ref(0) =?= f(0): True
0^0^1: ref(1) =?= f(1): True
0^1^0: ref(1) =?= f(1): True
0^1^1: ref(0) =?= f(0): True
1^0^0: ref(1) =?= f(1): True
1^0^1: ref(0) =?= f(0): True
1^1^0: ref(0) =?= f(0): True
1^1^1: ref(1) =?= f(1): True
evaulating reduced f:
0^0^0: ref(0) =?= f(0): True
0^0^1: ref(1) =?= f(1): True
0^1^0: ref(1) =?= f(1): True
0^1^1: ref(0) =?= f(0): True
1^0^0: ref(1) =?= f(1): True
1^0^1: ref(0) =?= f(0): True
1^1^0: ref(0) =?= f(0): True
1^1^1: ref(1) =?= f(1): True
That is, both the original polynomial and the reduced polynomial match
the expected results. But the reduced polynomial is not in our language L
,
since it has a term that is a product of b
with axorc
.
a ^ b ^ c ^ d
. We find that the reduced polynomial is
-a - b + c + 3*d - 3*axorb - axorc
+ axord - bxorc + bxord + 3*cxord
- 3*axorbxorc - axorbxord
+ axorcxord + bxorcxord
which happily has no products between terms! It also passes our sanity
check, so we've now found the answer.
The full output is:
value of a^b^c^d:
4*a*b*c + 4*a*b*d + 4*a*c*d + 4*b*c*d
- 2*a*b - 2*a*c - 2*b*c - 2*a*d
- 2*b*d - 2*c*d + a + b + c + d
reduced: -a - b + c + 3*d - 3*axorb
- axorc + axord - bxorc + bxord
+ 3*cxord - 3*axorbxorc
- axorbxord + axorcxord + bxorcxord
evaluating original a^b^c^d
0^0^0^0: ref(0) =?= f(0): True
0^0^0^1: ref(1) =?= f(1): True
0^0^1^0: ref(1) =?= f(1): True
0^0^1^1: ref(0) =?= f(0): True
0^1^0^0: ref(1) =?= f(1): True
0^1^0^1: ref(0) =?= f(0): True
0^1^1^0: ref(0) =?= f(0): True
0^1^1^1: ref(1) =?= f(1): True
1^0^0^0: ref(1) =?= f(1): True
1^0^0^1: ref(0) =?= f(0): True
1^0^1^0: ref(0) =?= f(0): True
1^0^1^1: ref(1) =?= f(1): True
1^1^0^0: ref(0) =?= f(0): True
1^1^0^1: ref(1) =?= f(1): True
1^1^1^0: ref(1) =?= f(1): True
1^1^1^1: ref(0) =?= f(0): True
evaluating reduced a^b^c^d
0^0^0^0: ref(0) =?= f(0): True
0^0^0^1: ref(1) =?= f(1): True
0^0^1^0: ref(1) =?= f(1): True
0^0^1^1: ref(0) =?= f(0): True
0^1^0^0: ref(1) =?= f(1): True
0^1^0^1: ref(0) =?= f(0): True
0^1^1^0: ref(0) =?= f(0): True
0^1^1^1: ref(1) =?= f(1): True
1^0^0^0: ref(1) =?= f(1): True
1^0^0^1: ref(0) =?= f(0): True
1^0^1^0: ref(0) =?= f(0): True
1^0^1^1: ref(1) =?= f(1): True
1^1^0^0: ref(0) =?= f(0): True
1^1^0^1: ref(1) =?= f(1): True
1^1^1^0: ref(1) =?= f(1): True
1^1^1^1: ref(0) =?= f(0): True
a^b^c^d
reduction: def xor3(x, y, z): return xor2(x, xor2(y, z))
R = IntegerModRing(8)['a, b, c, d, axorb, axorc, axord, bxorc, \
bxord, cxord, axorbxorc, axorbxord, axorcxord, bxorcxord']
(a, b, c, d, axorb, axorc, axord, bxorc, bxord, cxord, axorbxorc, \
axorbxord, axorcxord, bxorcxord) = R.gens()
I = ideal((axorb - xor2(a, b),
axorc - xor2(a, c),
axord - xor2(a, d),
bxorc - xor2(b, c),
bxord - xor2(b, d),
cxord - xor2(c, d),
axorbxorc - xor3(a, b, c),
axorbxord - xor3(a, b, d),
axorcxord - xor3(a, c, d),
bxorcxord - xor3(b, c, d),
a*a-a,
b*b-b,
c*c-c,
d*d-d
))
IG = I.groebner_basis()
f_orig = (xor2(a, xor2(b, xor2(c, d))))
f_reduced = f_orig.reduce(IG)
print("value of a^b^c^d:\n\t%s\n\treduced: %s" % (f_orig, f_reduced))
def evalxor3(f):
for (i, j, k, l) in [(i, j, k, l) for i in [0, 1] \
for j in [0, 1] \
for k in [0, 1] \
for l in [0, 1]]:
ref = i^^j^^k^^l
eval = f.substitute(a=i, b=j, c=k, d=l,
axorb=i^^j, axorc=i^^k,
axord=i^^l, bxorc=j^^k,
bxord=j^^l, cxord=k^^l,
axorbxorc=i^^j^^k, axorbxord=i^^j^^l,
axorcxord=i^^k^^l, bxorcxord=j^^k^^l)
print("%s^%s^%s^%s: ref(%s) =?= f(%s): %s" %
(i, j, k, l, ref, eval, ref == eval))
print("evaluating original a^b^c^d")
evalxor3(f_orig)
print("evaluating reduced a^b^c^d")
evalxor3(f_reduced)