§ Computing with High Dimensional Vectors
- Choose binary vectors.
- Lemma 1 (Johnson Lidenstrauss): any n points in high dimensional space can be embedded into O(log n / eps^2) dimensions with (1 +/- eps) distortion of distances.
- Lemma 2: Randomly chosen vectors are approximately orthogonal with high probability.
- multiplication is chosen to be XOR.
- Addition is chosen to be bitwise addition of components, followed by MAJORITY (keep components that are 1 in at least half the vectors).
- Addition creates vectors that are similar to input vectors.
- Multiplication creates vectors that are dissimilar to input vectors.
- Multiplication is invertible (XOR), and distributes over addition(?!).
- Permutation (rotation) is invertible, and distributes over addition (?!) and multiplication (?!). (I guess this makes intuitive sense, since it's a reindexing, but it's still at least a bit surprising.)
- mutiplication and permutation preserve similarity.
- This gives us a flexible algebraic system to compute with.
- Stanford Seminar: computing with high dimensional vectors