I learnt of a nice, formal way to prove the correctness of Fenwick trees in terms of orbits that I wish to reproduce here. One can use a Fenwick tree to perform cumulative sums Sum(n)≡∑i=0nA[i], and updates Upd(i,v)≡A[i]+=v. Naively, cumulative sums can take O(n) time and updates take O(1) time. A Fenwick tree can perform both in log(n). In general, we can perform any monoid-based catenation and update in log(n), as long as our queries start from the 0th index till some index n. So we can only handle monoidal fold queries of the form ∑i=0n and point updates.
We allow indexes [1,2,…n]. The node with factorization i≡2k×l, 2∣l (that is, k is the highest power of 2 in i) is responsible for the interval [i−2k+1,i]=(i−2k,i]. I'm going to state all the manipulations in terms of prime factorizations, since I find it far more intuitive than bit-fiddling. In general, I want to find a new framework to discover and analyze bit-fiddling heavy algorithms. Some examples of the range of responsibility of an index are:
To perform a cumulative sum, we need to read from the correct overlap regions that cover the full array. For example, to read from 15, we would want to read:
At each location, we strip off the value 2r. We can discover this value with bit-fiddling: We claim that a&(−a)=2r. Let a≡⟨x10r⟩. We wish to extract the output as 2r=⟨10r⟩, which is the power of 2 that we need to subtract from a to strip the rightmost 1. We get:
To perform an update at i, we need to update all locations which on querying overlap with i. For example, to update the location 9, we would want to update:
We wish to analyze the operations Query(q)≡∑i=1qa[i], and Update(i,val)≡a[i]+=val. To do this, we are allowed to maintain an auxiliary array d which we will manipuate. We will stipulate the conditions of operations on d such that they will reflect the values of Query and Update, albeit much faster. We will analyze the algorithm in terms of orbits. We have two operators, one for update called U, and one for query called Q. Given an index i, repeatedly applying the query operator gives us the indeces we need to read and accumulate from the underlying array a to get the total sum a[0..i]:
Query(i)=∑id[Qi(q)]
Given an index u, repeatedly applying the update operator U gives us all the indeces we need to add the change to update:
Update(i,val)=∀j,d[Uj(i)]+=val
For query and update to work, we need the condition that:
q≥u⟺∣∣∣{Qi(q):i∈N}∩{Ui(u):i∈N}∣∣∣=1
That is, if and only if the query index q includes the update location u, will the orbits intersect. The intuition is that we want updates at an index u to only affect queries that occur at indeces q≥u. Hence, we axiomatise that for an update to be legal, it must the orbits of queries that are at indeces greater than it. We will show that our operators:
Q(i=2r⋅a)=i−2r=2r(a−1)
U(j=2s⋅b)=j+2s=2s(b+1)
do satisfy the conditions above. For a quick numerical check, we can use the code blow to ensure that the orbits are indeed disjoint:
# calculate orbits of query and update in fenwick treedeflsb(i):return i &(-i)defU(i):return i + lsb(i)defQ(i):return i - lsb(i)deforbit(f, i):
s =set()while i notin s and i >0and i <64:
s.add(i); i = f(i)return s
if __name__ =="__main__":for q inrange(1,16):for u inrange(1,16):
qo = orbit(Q, q); uo = orbit(U, u)
c = qo.intersection(uo)print("q:%4s | u:%4s | qo: %20s | uo: %20s | qu: %4s"%(q, u, qo, uo, c))print("--")
We note that Q always decreases the value of q, and u always increases it. Hence, if q=u, they meet at this point, and ∀i,j≥1,Qi(q)=Uj(u). Hence, they meet exactly once as required.
Let the entire array have size 2N. Let q=e1 f_q,u=e0 f_u, where e, f_q, f_u may be empty strings. Notice that Q will always strip away rightmost ones in fq, leading to q=e10...0 at some point. Similarly, U will keep on adding new rightmost ones, causing the state to be u=e01...10...0Ue100.... Hence, at some point q=u.