§ Fenwick trees and orbits

I learnt of a nice, formal way to prove the correctness of Fenwick trees in terms of orbits that I wish to reproduce here. One can use a Fenwick tree to perform cumulative sums $Sum(n) \equiv \sum_{i=0}^n A[i]$, and updates $Upd(i, v) \equiv A[i]~\texttt{+=}~v$. Naively, cumulative sums can take $O(n)$ time and updates take $O(1)$ time. A Fenwick tree can perform both in $O(\log n)$ time. In general, we can perform any monoid-based catenation and update in $O(\log n)$, as long as our queries start from the 0th index and run up to some index $n$. So we can only handle monoidal fold queries of the form $\sum_{i=0}^n$ and point updates.
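
For contrast, here is a minimal sketch of the naive approach (the array name A, the size N = 16, and the function names are placeholders of my own choosing, not from the original): the query walks the whole prefix, while the update touches a single cell.

// Naive baseline: O(n) cumulative sum, O(1) point update.
#define N 16
int A[N];

int naive_sum(int n) {          // Sum(n) = A[0] + ... + A[n]
    int s = 0;
    for (int i = 0; i <= n; ++i) s += A[i];
    return s;
}

void naive_upd(int i, int v) {  // Upd(i, v) = A[i] += v
    A[i] += v;
}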

§ organization

We allow indexes $[1, 2, \dots, n]$. The node with factorization $i \equiv 2^k \times l$, $2 \nmid l$ (that is, $2^k$ is the largest power of $2$ dividing $i$) is responsible for the interval $[i - 2^k + 1, i] = (i - 2^k, i]$. I'm going to state all the manipulations in terms of prime factorizations, since I find it far more intuitive than bit-fiddling. In general, I want to find a new framework to discover and analyze bit-fiddling-heavy algorithms. Some examples of the range of responsibility of an index are (a small program that checks these ranges follows the list):
  • $1 = 2^0 \times 1 = (0, 1]$ (Subtract $2^0 = 1$)
  • $2 = 2 \times 1 = (0, 2]$ (Subtract $2^1 = 2$)
  • $3 = 3 = (2, 3]$
  • $4 = 2^2 = (0, 4]$
  • $5 = 5 = (4, 5]$
  • $6 = 2 \times 3 = (4, 6]$
  • $7 = 7 = (6, 7]$
  • $8 = 2^3 = (0, 8]$
  • $9 = 9 = (8, 9]$
  • $10 = 2 \times 5 = (8, 10]$
  • $11 = 11 = (10, 11]$
  • $12 = 2^2 \times 3 = (8, 12]$
  • $13 = 13 = (12, 13]$
  • $14 = 2 \times 7 = (12, 14]$
  • $15 = 15 = (14, 15]$
  • $16 = 2^4 = (0, 16]$
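
As promised, here is a small standalone check of the table above (the bound 16 and the printing are my own scaffolding, not part of the tree). It prints, for every index $i$, the interval $(i - 2^k, i]$ it is responsible for, where $2^k$ is the lowest set bit of $i$:

#include <stdio.h>
#define LSB(x) ((x) & -(x)) // lowest set bit: 2^k for i = 2^k * l, l odd

int main(void) {
    // print the range of responsibility (i - 2^k, i] for i = 1..16
    for (int i = 1; i <= 16; ++i)
        printf("%2d = 2^k * l, 2^k = %2d  ->  (%2d, %2d]\n", i, LSB(i), i - LSB(i), i);
    return 0;
}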

§ query

To perform a cumulative sum up to an index $q$, we need to read from the regions that together cover the full prefix $(0, q]$. For example, to query up to $15$, we would want to read:
  • $a[15] = (14, 15]$, $a[14] = (12, 14]$, $a[12] = (8, 12]$, $a[8] = (0, 8]$.
So we need to read the indices:
  • $15 = 2^0 \cdot 15 \xrightarrow{-2^0} 14 = 2^1 \cdot 7 \xrightarrow{-2^1} 12 = 2^2 \cdot 3 \xrightarrow{-2^2} 8 = 2^3 \cdot 1 \xrightarrow{-2^3} 0$
At each location, we strip off the value $2^r$. We can discover this value with bit-fiddling: we claim that $a \& (-a) = 2^r$. Let $a \equiv \langle x 1 0^r \rangle$, i.e. some bit-string $x$, followed by the rightmost $1$, followed by $r$ zeroes. We wish to extract the output $2^r = \langle 1 0^r \rangle$, which is the power of $2$ that we need to subtract from $a$ to strip the rightmost $1$. We get:
$$
\begin{aligned}
&-a = \lnot a + 1 = \langle \overline{x} 0 1^r \rangle + 1 = \langle \overline{x} 1 0^r \rangle \\
&a \& (-a) = a \& (\lnot a + 1) = \langle x 1 0^r \rangle \& \langle \overline{x} 1 0^r \rangle = \langle 0^{|x|} 1 0^r \rangle = 2^r
\end{aligned}
$$
$$
a - (a \& (-a)) = \langle x 1 0^r \rangle - \langle 1 0^r \rangle = \langle x 0 0^r \rangle
$$
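
To see the identity on a concrete value (chosen arbitrarily), take $a = 12 = \langle 1100 \rangle$, so $r = 2$. In 8-bit two's complement:

$$
\lnot a + 1 = \langle 11110011 \rangle + 1 = \langle 11110100 \rangle, \qquad
a \& (-a) = \langle 00001100 \rangle \& \langle 11110100 \rangle = \langle 00000100 \rangle = 4 = 2^2,
$$

and $a - (a \& (-a)) = 12 - 4 = 8 = \langle 1000 \rangle$, which is exactly the jump $12 \to 8$ in the query chain for $15$ above.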
That is, we successfully strip off the rightmost $1$. Armed with the theory, our implementation becomes:
#define LSB(x) ((x) & -(x)) // lowest set bit of x, i.e. 2^r
int a[N]; // Fenwick array over indices 1..N-1
// cumulative sum over (0, i]
int q(int i) {
    int s = 0;
    while (i > 0) {
        s += a[i];   // add the range (i - LSB(i), i]
        i -= LSB(i); // strip off the rightmost 1
    }
    return s;
}

§ update

To perform an update at $i$, we need to update all the locations whose range of responsibility contains $i$ (i.e. every node that a query covering $i$ might read). For example, to update the location $9$, we would want to update:
  • $a[9] = (8, 9]$, $a[10] = (8, 10]$, $a[12] = (8, 12]$, $a[16] = (0, 16]$.
So we need to update the indices:
  • $9 = 2^0 \cdot 9 \xrightarrow{+2^0} 10 = 2^1 \cdot 5 \xrightarrow{+2^1} 12 = 2^2 \cdot 3 \xrightarrow{+2^2} 16 = 2^4 \cdot 1 \xrightarrow{+2^4} \dots$
We use the same bit-fiddling technique as above to find the value $2^r$, which we now add instead of subtract:
#define LSB(x) ((x) & -(x))
int a[N]; // the same Fenwick array as in the query above
// point update: add v at index i
void u(int i, int v) {
    while (i < N) { a[i] += v; i += LSB(i); } // visit every node responsible for i
}
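
Putting q and u together, here is a small end-to-end check (my own scaffolding: the size N = 16, the test values, and the brute-force reference array are arbitrary choices, and q/u are restated so the block is self-contained). It updates through the tree and verifies every prefix sum against a naive recomputation.

#include <assert.h>
#include <stdio.h>

#define N 16
#define LSB(x) ((x) & -(x))

int a[N];   // Fenwick array, indices 1..N-1
int raw[N]; // plain array holding the same values, for checking

int q(int i) {                       // prefix sum over (0, i]
    int s = 0;
    for (; i > 0; i -= LSB(i)) s += a[i];
    return s;
}

void u(int i, int v) {               // point update at index i
    for (; i < N; i += LSB(i)) a[i] += v;
}

int main(void) {
    // insert some arbitrary values, mirroring them into raw[]
    for (int i = 1; i < N; ++i) { u(i, i * i); raw[i] += i * i; }
    u(9, 100); raw[9] += 100;

    // every prefix sum computed via the tree must match the naive sum
    for (int i = 1; i < N; ++i) {
        int naive = 0;
        for (int j = 1; j <= i; ++j) naive += raw[j];
        assert(q(i) == naive);
    }
    printf("q(15) = %d\n", q(15));
    return 0;
}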

§ correctness

We wish to analyze the operations $Query(q) \equiv \sum_{i=1}^q a[i]$ and $Update(i, val) \equiv a[i]~\texttt{+=}~val$. To do this, we are allowed to maintain an auxiliary array $d$ which we will manipulate. We will stipulate conditions on the operations over $d$ such that they reflect the values of $Query$ and $Update$, albeit much faster. We will analyze the algorithm in terms of orbits. We have two operators, one for update called $U$, and one for query called $Q$. Given an index $q$, repeatedly applying the query operator $Q$ gives us the indices of $d$ we need to read and accumulate in order to recover the prefix sum $a[1..q]$:
  • $Query(q) = \sum_i d[Q^i(q)]$ (the sum ranges over the orbit of $q$ under $Q$, until it reaches $0$)
Given an index $u$, repeatedly applying the update operator $U$ gives us all the indices of $d$ to which we need to add the change:
  • $Update(u, val) = \forall j,\ d[U^j(u)]~\texttt{+=}~val$ (for every $j$ with $U^j(u)$ inside the array)
For query and update to work, we need the condition that:
  • $q \geq u \iff \left\vert \{ Q^i(q) : i \in \mathbb{N} \} \cap \{ U^i(u) : i \in \mathbb{N} \} \right\vert = 1$
That is, the orbits intersect (exactly once) if and only if the query index $q$ includes the update location $u$. The intuition is that we want updates at an index $u$ to affect only queries that occur at indices $q \geq u$. Hence, we axiomatise that for an update to be legal, it must intersect the orbit of every query at an index greater than or equal to it exactly once, and must miss the orbits of queries at smaller indices. We will show that our operators:
  • $Q(i = 2^r \cdot a) = i - 2^r = 2^r(a - 1)$ (where $a$ is odd)
  • $U(j = 2^s \cdot b) = j + 2^s = 2^s(b + 1)$ (where $b$ is odd)
do satisfy the conditions above. For a quick numerical check, we can use the code below to verify that the orbits intersect exactly once when $q \geq u$, and are disjoint otherwise:
# calculate orbits of query and update in fenwick tree

def lsb(i): return i & (-i)
def U(i): return i + lsb(i)
def Q(i): return i - lsb(i)
def orbit(f, i):
    s = set()
    # follow f from i; stop at 0 (query orbit) or at the array bound 64 (update orbit)
    while i not in s and i > 0 and i < 64:
        s.add(i); i = f(i)
    return s

if __name__ == "__main__":
    for q in range(1, 16):
        for u in range(1, 16):
            qo = orbit(Q, q); uo = orbit(U, u)
            c = qo.intersection(uo)
            print("q:%4s | u:%4s | qo: %20s | uo: %20s | qu: %4s" %
                  (q, u, qo, uo, c))

        print("--")

§ Case 1: $q = u$

We note that $Q$ always decreases the value of its argument, and $U$ always increases it. Hence, if $q = u$, the orbits meet at this common starting point, and $\forall i, j \geq 1, \quad Q^i(q) < q = u < U^j(u)$, so $Q^i(q) \neq U^j(u)$. Hence, they meet exactly once, as required.

§ Case 2: $q < u$

As noted above, $q$ always decreases and $u$ always increases; hence, in this case, they will never meet, as required.

§ Case 3: $q > u$

Let the entire array have size $2^N$. Since $q > u$, we can write $q = \texttt{e1f}_q$ and $u = \texttt{e0f}_u$ in binary, where $\texttt{e}$ is the common prefix up to the first bit where they differ, and $\texttt{e}, \texttt{f}_q, \texttt{f}_u$ may be empty strings (with $|\texttt{f}_q| = |\texttt{f}_u|$). Notice that $Q$ will always strip away the rightmost ones of $\texttt{f}_q$, leading to $q = \texttt{e10\dots0}$ at some point. Similarly, $U$ will keep on adding new rightmost ones, and the carries eventually flip the $\texttt{0}$ after the prefix: $u = \texttt{e01\dots10\dots0} \xrightarrow{U} \texttt{e10\dots0}$. Hence, at some point $q = u$. Moreover, they meet only once: immediately after the meeting point $\texttt{e10\dots0}$, the update orbit jumps to at least $\texttt{e10\dots0} + 2^{|\texttt{f}_q|} > \texttt{e1f}_q = q$, which exceeds every element of the query orbit, so the two orbits cannot intersect again.
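
To make Case 3 concrete (example values of my own choosing), take $q = 13 = \texttt{1101}$ and $u = 9 = \texttt{1001}$, so $\texttt{e} = \texttt{1}$ and $\texttt{f}_q = \texttt{f}_u = \texttt{01}$. The query orbit is $13 \to 12 \to 8 \to 0$ and the update orbit is $9 \to 10 \to 12 \to 16 \to 32 \to \dots$; they intersect exactly once, at $12 = \texttt{1100} = \texttt{e10\dots0}$, matching the argument above.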

§ References