I learnt of a nice, formal way to prove the correctness of Fenwick
trees in terms of orbits that I wish to reproduce here.
One can use a Fenwick tree to perform cumulative sums
Sum(n)≡∑i=0nA[i], and updates Upd(i,v)≡A[i]+=v. Naively,
cumulative sums can take O(n) time and updates take O(1) time.
A Fenwick tree can perform both in log(n). In general, we can perform
any monoid-based catenation and update in log(n), as long as our queries start from the 0th index till some index n. So we can only handle monoidal
fold queries of the form ∑i=0n and point updates.
We allow indexes [1,2,…n]. The node with factorization i≡2k×l,
2∣l (that is, k is the highest power of 2 in i)
is responsible for the interval [i−2k+1,i]=(i−2k,i].
I'm going to state all the manipulations in terms of prime factorizations,
since I find it far more intuitive than bit-fiddling. In general, I want
to find a new framework to discover and analyze bit-fiddling heavy algorithms.
Some examples of the range of responsibility of an index are:
To perform a cumulative sum, we need to read from the correct overlap regions
that cover the full array. For example, to read from 15, we would want
to read:
At each location, we strip off the value 2r. We can discover this value
with bit-fiddling: We claim that a&(−a)=2r.
Let a≡⟨x10r⟩. We wish to extract the output as 2r=⟨10r⟩,
which is the power of 2 that we need to subtract from a to strip the rightmost 1.
We get:
To perform an update at i, we need to update all locations which on querying
overlap with i. For example, to update the location 9, we would want to
update:
We wish to analyze the operations Query(q)≡∑i=1qa[i], and
Update(i,val)≡a[i]+=val. To do this, we are allowed to maintain
an auxiliary array d which we will manipuate. We will stipulate the
conditions of operations on d such that they will reflect the values of
Query and Update, albeit much faster.
We will analyze the algorithm in terms of orbits. We have two operators, one
for update called U, and one for query called Q. Given an index i,
repeatedly applying the query operator gives us the indeces we need to read and
accumulate from the underlying array a to get the total sum a[0..i]:
Query(i)=∑id[Qi(q)]
Given an index u, repeatedly applying the update operator U gives us all
the indeces we need to add the change to update:
Update(i,val)=∀j,d[Uj(i)]+=val
For query and update to work, we need the condition that:
q≥u⟺∣∣∣{Qi(q):i∈N}∩{Ui(u):i∈N}∣∣∣=1
That is, if and only if the query index q includes the update location u,
will the orbits intersect.
The intuition is that we want updates at an index u to only affect queries
that occur at indeces q≥u. Hence, we axiomatise that for an update
to be legal, it must the orbits of queries that are at indeces greater than it.
We will show that our operators:
Q(i=2r⋅a)=i−2r=2r(a−1)
U(j=2s⋅b)=j+2s=2s(b+1)
do satisfy the conditions above.
For a quick numerical check, we can use the code blow to ensure
that the orbits are indeed disjoint:
# calculate orbits of query and update in fenwick treedeflsb(i):return i &(-i)defU(i):return i + lsb(i)defQ(i):return i - lsb(i)deforbit(f, i):
s =set()while i notin s and i >0and i <64:
s.add(i); i = f(i)return s
if __name__ =="__main__":for q inrange(1,16):for u inrange(1,16):
qo = orbit(Q, q); uo = orbit(U, u)
c = qo.intersection(uo)print("q:%4s | u:%4s | qo: %20s | uo: %20s | qu: %4s"%(q, u, qo, uo, c))print("--")
We note that Q always decreases the value of q, and u always increases
it. Hence, if q=u, they meet at this point, and
∀i,j≥1,Qi(q)=Uj(u).
Hence, they meet exactly once as required.
Let the entire array have size 2N.
Let q=e1 f_q,u=e0 f_u, where
e, f_q, f_u may be empty strings.
Notice that Q will always strip away rightmost ones in fq,
leading to q=e10...0 at some point.
Similarly, U will keep on adding new rightmost ones, causing the
state to be u=e01...10...0Ue100....
Hence, at some point q=u.