§ Fenwick trees and orbits

I learnt of a nice, formal way to prove the correctness of Fenwick trees in terms of orbits that I wish to reproduce here. One can use a Fenwick tree to perform cumulative sums $Sum(n) \equiv \sum_{i=0}^n A[i]$, and updates $Upd(i, v) \equiv A[i]~\texttt{+=}~v$. Naively, cumulative sums can take $O(n)$ time and updates take $O(1)$ time. A Fenwick tree can perform both in $O(\log n)$ time. In general, we can perform any monoid-based catenation and update in $O(\log n)$, as long as our queries start from the 0th index and run up to some index $n$. So we can only handle monoidal fold queries of the form $\sum_{i=0}^n$ and point updates.
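
For contrast, here is a minimal sketch of the naive approach (the array name A, the size N = 16, and the function names are placeholders of my own choosing, not from the original): the query walks the whole prefix, while the update touches a single cell.

// Naive baseline: O(n) cumulative sum, O(1) point update.
#define N 16
int A[N];

int naive_sum(int n) {          // Sum(n) = A[0] + ... + A[n]
    int s = 0;
    for (int i = 0; i <= n; ++i) s += A[i];
    return s;
}

void naive_upd(int i, int v) {  // Upd(i, v) = A[i] += v
    A[i] += v;
}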

§ organization

We allow indexes $[1, 2, \dots, n]$. The node with factorization $i \equiv 2^k \times l$, $2 \nmid l$ (that is, $2^k$ is the largest power of $2$ dividing $i$) is responsible for the interval $[i - 2^k + 1, i] = (i - 2^k, i]$. I'm going to state all the manipulations in terms of prime factorizations, since I find it far more intuitive than bit-fiddling. In general, I want to find a new framework to discover and analyze bit-fiddling-heavy algorithms. Some examples of the range of responsibility of an index are (a small program that checks these ranges follows the list):
  • $1 = 2^0 \times 1 = (0, 1]$ (Subtract $2^0 = 1$)
  • $2 = 2 \times 1 = (0, 2]$ (Subtract $2^1 = 2$)
  • $3 = 3 = (2, 3]$
  • $4 = 2^2 = (0, 4]$
  • $5 = 5 = (4, 5]$
  • $6 = 2 \times 3 = (4, 6]$
  • $7 = 7 = (6, 7]$
  • $8 = 2^3 = (0, 8]$
  • $9 = 9 = (8, 9]$
  • $10 = 2 \times 5 = (8, 10]$
  • $11 = 11 = (10, 11]$
  • $12 = 2^2 \times 3 = (8, 12]$
  • $13 = 13 = (12, 13]$
  • $14 = 2 \times 7 = (12, 14]$
  • $15 = 15 = (14, 15]$
  • $16 = 2^4 = (0, 16]$
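
As promised, here is a small standalone check of the table above (the bound 16 and the printing are my own scaffolding, not part of the tree). It prints, for every index $i$, the interval $(i - 2^k, i]$ it is responsible for, where $2^k$ is the lowest set bit of $i$:

#include <stdio.h>
#define LSB(x) ((x) & -(x)) // lowest set bit: 2^k for i = 2^k * l, l odd

int main(void) {
    // print the range of responsibility (i - 2^k, i] for i = 1..16
    for (int i = 1; i <= 16; ++i)
        printf("%2d = 2^k * l, 2^k = %2d  ->  (%2d, %2d]\n", i, LSB(i), i - LSB(i), i);
    return 0;
}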

§ query

To perform a cumulative sum up to an index $q$, we need to read from the regions that together cover the full prefix $(0, q]$. For example, to query up to $15$, we would want to read:
  • $a[15] = (14, 15]$, $a[14] = (12, 14]$, $a[12] = (8, 12]$, $a[8] = (0, 8]$.
So we need to read the indices:
  • $15 = 2^0 \cdot 15 \xrightarrow{-2^0} 14 = 2^1 \cdot 7 \xrightarrow{-2^1} 12 = 2^2 \cdot 3 \xrightarrow{-2^2} 8 = 2^3 \cdot 1 \xrightarrow{-2^3} 0$
At each location, we strip off the value $2^r$. We can discover this value with bit-fiddling: we claim that $a \& (-a) = 2^r$. Let $a \equiv \langle x 1 0^r \rangle$, i.e. some bit-string $x$, followed by the rightmost $1$, followed by $r$ zeroes. We wish to extract the output $2^r = \langle 1 0^r \rangle$, which is the power of $2$ that we need to subtract from $a$ to strip the rightmost $1$. We get:
$$
\begin{aligned}
&-a = \lnot a + 1 = \langle \overline{x} 0 1^r \rangle + 1 = \langle \overline{x} 1 0^r \rangle \\
&a \& (-a) = a \& (\lnot a + 1) = \langle x 1 0^r \rangle \& \langle \overline{x} 1 0^r \rangle = \langle 0^{|x|} 1 0^r \rangle = 2^r
\end{aligned}
$$
$$
a - (a \& (-a)) = \langle x 1 0^r \rangle - \langle 1 0^r \rangle = \langle x 0 0^r \rangle
$$
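
To see the identity on a concrete value (chosen arbitrarily), take $a = 12 = \langle 1100 \rangle$, so $r = 2$. In 8-bit two's complement:

$$
\lnot a + 1 = \langle 11110011 \rangle + 1 = \langle 11110100 \rangle, \qquad
a \& (-a) = \langle 00001100 \rangle \& \langle 11110100 \rangle = \langle 00000100 \rangle = 4 = 2^2,
$$

and $a - (a \& (-a)) = 12 - 4 = 8 = \langle 1000 \rangle$, which is exactly the jump $12 \to 8$ in the query chain for $15$ above.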
That is, we successfully strip off the rightmost $1$. Armed with the theory, our implementation becomes:
#define LSB(x) ((x) & -(x)) // lowest set bit of x, i.e. 2^r
int a[N]; // Fenwick array over indices 1..N-1
// cumulative sum over (0, i]
int q(int i) {
    int s = 0;
    while (i > 0) {
        s += a[i];   // add the range (i - LSB(i), i]
        i -= LSB(i); // strip off the rightmost 1
    }
    return s;
}

§ update

To perform an update at $i$, we need to update all the locations whose range of responsibility contains $i$ (i.e. every node that a query covering $i$ might read). For example, to update the location $9$, we would want to update:
  • $a[9] = (8, 9]$, $a[10] = (8, 10]$, $a[12] = (8, 12]$, $a[16] = (0, 16]$.
So we need to update the indices:
  • $9 = 2^0 \cdot 9 \xrightarrow{+2^0} 10 = 2^1 \cdot 5 \xrightarrow{+2^1} 12 = 2^2 \cdot 3 \xrightarrow{+2^2} 16 = 2^4 \cdot 1 \xrightarrow{+2^4} \dots$
We use the same bit-fiddling technique as above to find the value $2^r$, which we now add instead of subtract:
#define LSB(x) ((x) & -(x))
int a[N]; // the same Fenwick array as in the query above
// point update: add v at index i
void u(int i, int v) {
    while (i < N) { a[i] += v; i += LSB(i); } // visit every node responsible for i
}
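
Putting q and u together, here is a small end-to-end check (my own scaffolding: the size N = 16, the test values, and the brute-force reference array are arbitrary choices, and q/u are restated so the block is self-contained). It updates through the tree and verifies every prefix sum against a naive recomputation.

#include <assert.h>
#include <stdio.h>

#define N 16
#define LSB(x) ((x) & -(x))

int a[N];   // Fenwick array, indices 1..N-1
int raw[N]; // plain array holding the same values, for checking

int q(int i) {                       // prefix sum over (0, i]
    int s = 0;
    for (; i > 0; i -= LSB(i)) s += a[i];
    return s;
}

void u(int i, int v) {               // point update at index i
    for (; i < N; i += LSB(i)) a[i] += v;
}

int main(void) {
    // insert some arbitrary values, mirroring them into raw[]
    for (int i = 1; i < N; ++i) { u(i, i * i); raw[i] += i * i; }
    u(9, 100); raw[9] += 100;

    // every prefix sum computed via the tree must match the naive sum
    for (int i = 1; i < N; ++i) {
        int naive = 0;
        for (int j = 1; j <= i; ++j) naive += raw[j];
        assert(q(i) == naive);
    }
    printf("q(15) = %d\n", q(15));
    return 0;
}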

§ correctness

We wish to analyze the operations $Query(q) \equiv \sum_{i=1}^q a[i]$ and $Update(i, val) \equiv a[i]~\texttt{+=}~val$. To do this, we are allowed to maintain an auxiliary array $d$ which we will manipulate. We will stipulate conditions on the operations over $d$ such that they reflect the values of $Query$ and $Update$, albeit much faster. We will analyze the algorithm in terms of orbits. We have two operators, one for update called $U$, and one for query called $Q$. Given an index $q$, repeatedly applying the query operator $Q$ gives us the indices of $d$ we need to read and accumulate in order to recover the prefix sum $a[1..q]$:
  • $Query(q) = \sum_i d[Q^i(q)]$ (the sum ranges over the orbit of $q$ under $Q$, until it reaches $0$)
Given an index $u$, repeatedly applying the update operator $U$ gives us all the indices of $d$ to which we need to add the change:
  • $Update(u, val) = \forall j,\ d[U^j(u)]~\texttt{+=}~val$ (for every $j$ with $U^j(u)$ inside the array)
For query and update to work, we need the condition that:
  • $q \geq u \iff \left\vert \{ Q^i(q) : i \in \mathbb{N} \} \cap \{ U^i(u) : i \in \mathbb{N} \} \right\vert = 1$
That is, the orbits intersect (exactly once) if and only if the query index $q$ includes the update location $u$. The intuition is that we want updates at an index $u$ to affect only queries that occur at indices $q \geq u$. Hence, we axiomatise that for an update to be legal, it must intersect the orbit of every query at an index greater than or equal to it exactly once, and must miss the orbits of queries at smaller indices. We will show that our operators:
  • $Q(i = 2^r \cdot a) = i - 2^r = 2^r(a - 1)$ (where $a$ is odd)
  • $U(j = 2^s \cdot b) = j + 2^s = 2^s(b + 1)$ (where $b$ is odd)
do satisfy the conditions above. For a quick numerical check, we can use the code below to verify that the orbits intersect exactly once when $q \geq u$, and are disjoint otherwise:
# calculate orbits of query and update in fenwick tree

def lsb(i): return i & (-i)
def U(i): return i + lsb(i)
def Q(i): return i - lsb(i)
def orbit(f, i):
    s = set()
    # follow f from i; stop at 0 (query orbit) or at the array bound 64 (update orbit)
    while i not in s and i > 0 and i < 64:
        s.add(i); i = f(i)
    return s

if __name__ == "__main__":
    for q in range(1, 16):
        for u in range(1, 16):
            qo = orbit(Q, q); uo = orbit(U, u)
            c = qo.intersection(uo)
            print("q:%4s | u:%4s | qo: %20s | uo: %20s | qu: %4s" %
                  (q, u, qo, uo, c))

        print("--")

§ Case 1: $q = u$

We note that $Q$ always decreases the value of its argument, and $U$ always increases it. Hence, if $q = u$, the orbits meet at this common starting point, and $\forall i, j \geq 1, \quad Q^i(q) < q = u < U^j(u)$, so $Q^i(q) \neq U^j(u)$. Hence, they meet exactly once, as required.

§ Case 2: $q < u$

As noted above, $q$ always decreases and $u$ always increases; hence, in this case, they will never meet, as required.

§ Case 3: $q > u$

Let the entire array have size $2^N$. Since $q > u$, we can write $q = \texttt{e1f}_q$ and $u = \texttt{e0f}_u$ in binary, where $\texttt{e}$ is the common prefix up to the first bit where they differ, and $\texttt{e}, \texttt{f}_q, \texttt{f}_u$ may be empty strings (with $|\texttt{f}_q| = |\texttt{f}_u|$). Notice that $Q$ will always strip away the rightmost ones of $\texttt{f}_q$, leading to $q = \texttt{e10\dots0}$ at some point. Similarly, $U$ will keep on adding new rightmost ones, and the carries eventually flip the $\texttt{0}$ after the prefix: $u = \texttt{e01\dots10\dots0} \xrightarrow{U} \texttt{e10\dots0}$. Hence, at some point $q = u$. Moreover, they meet only once: immediately after the meeting point $\texttt{e10\dots0}$, the update orbit jumps to at least $\texttt{e10\dots0} + 2^{|\texttt{f}_q|} > \texttt{e1f}_q = q$, which exceeds every element of the query orbit, so the two orbits cannot intersect again.
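
To make Case 3 concrete (example values of my own choosing), take $q = 13 = \texttt{1101}$ and $u = 9 = \texttt{1001}$, so $\texttt{e} = \texttt{1}$ and $\texttt{f}_q = \texttt{f}_u = \texttt{01}$. The query orbit is $13 \to 12 \to 8 \to 0$ and the update orbit is $9 \to 10 \to 12 \to 16 \to 32 \to \dots$; they intersect exactly once, at $12 = \texttt{1100} = \texttt{e10\dots0}$, matching the argument above.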

§ References