A Universe of Sorts

§ Finite differences and Umbral calculus

Umbral calculus lays out a large collection of "umbral" / "shadowy" coincidences across combinatorics and calculus. Here, I'll lay out some of these that I learnt from Concrete Mathematics . I hope to use this for myself to motivate a bunch of combinatorics. I'll provide an interesting proof of why

\sum{1 \leq k < n} k^2 = k(k-1)/2

using this umbral calculus.

§ Discrete Derivative: Forward difference

We begin by trying to build a discrete version of the derivative operator. The derivative of f will be denoted using f'. The forward difference of

f: \mathbb R \rightarrow \mathbb R

is defined as:

\delta f: \mathbb R \rightarrow \mathbb R; (\delta f)(x) = f(x + 1) - f(x)

Immediately, we can see that this operator is linear:

\begin{aligned} &\delta(f + g) &= (f+g)(x+1) - (f+g)(x) \\ &= (f(x+1) - f(x)) + (g(x+1)-g(x)) \\ &= (\delta f) + (\delta g) \\ &\delta(\alpha f)(x) &= (\alpha f)(x+1) - (\alpha f)(x) \\ &= \alpha \cdot (f(x+1) - f(x)) \\ &= \alpha (\delta f) \end{aligned}

it obeys a slightly corrupted version of the chain rule,

(fg)' = f' g + g' f

\begin{aligned} &\delta(fg) \\ &= (fg)(x+1) - (fg)(x) \\ &= f(x+1)g(x+1) - f(x)g(x) + 0 \\ &= f(x+1)g(x+1) - f(x)g(x) + [f(x)g(x+1) - f(x)g(x+1)] \\ &= g(x+1)[f(x+1) - f(x)] + f(x)[g(x+1) - g(x)] \\ &= g(x+1)(\delta f)(x) + f(x) (\delta g)(x) \\ &= (S \delta f)(x) + (f \delta g)(x) [(Sh)(x) \equiv h(x+1)] \\ &= (S \delta f + f \delta g)(x) \\ \end{aligned}

We need this new

S

operator to shift the function's input from

x

x+1

§ Falling factorial as polynomials

cool, now that we seem to have a linear derivative operator that behaves roughly sanely, let's test it out! The first reasonable target is a polynomial, so let's try

x^2

\delta(x^2) = (x+1)^2 - x^2 = 2x + 1

This is disappointing, it does not behave very well :( However, there is an analogue that does well. This is the falling factorial , defined as:

x^{(n)} \equiv x(x-1)(x-2)\cdots(x-n+1)

For example:

x^{(0)} = 1 \\ x^{(1)} = x \\ x^{(2)} = x(x-1) \\ x^{(3)} = x(x-1)(x-2) \\

Let's try and apply our discrete difference

\delta

\begin{aligned} &\delta(x^{(2)}) \\ & = (x+1)(x-1+1) - x(x-1) \\ & = (x+1)(x) - x(x-1) \\ & = x*2 = 2x(1) \\ &\delta(x^{(3)}) \\ &= (x+1)(x-1+1)(x-2+1) - x(x-1)(x-2) \\ &= (x+1)(x)(x-1) - x(x-1)(x-2) \\ &= x(x-1)((x+1) - (x-2)) = 3x(x-1) = 3x^{(2)} \\ \end{aligned}

These falling factorials look pretty unnatural though, why do we care? Well, once we build some integral calculus, we can handle our problem of

\sum_{1 \leq i < k} i^2

using this calculus, by rewriting

i^2

in terms of these falling factorials.

§ Sums as discrete integrals

We want to think of

\sum_{0 \leq i < n} f(i)

as the correct variant of

\int_0^n f(i) di

. The most important property of an integral is the fundamental theorem of calculus:

\int_a^b f'(i) di = f(b) - f(a) \mapsto \sum_{a \leq i < b} (\delta f)(i) =?= f(b) - f(a)

we can check if the assertion is true:

\begin{aligned} &\sum_{a \leq i < b} (\delta f)(i) \\ &= [f(a+1) - f(a)] + [f(a+2) - f(a+1)] + [f(a+3)-f(a+2)] + \cdots + [f(b) - f(b-1)] \\ &= f(b) - f(a) \quad \text{(The sum telescopes)} \end{aligned}

Sweet, so we just kicked a theory of calculus of the ground. Let's put this to some use:

§ Gauss' famous formula from discrete calculus

Let's begin by deriving the closed form for

[1\cdot(k-1)]

naturals:

\sum_{0 \leq i < n} i = \sum_{0 \leq i < n} i^{(1)} = i^{(2)}/2 \big|_{0}^n = n(n-1)/2

Let's now derive the closed form form the sum of squares of

[1\cdot(k-1)]

\begin{aligned} &\sum_{0 \leq i < n} i^2 \\ &= \sum_{0 \leq i < n} i*(i-1) + i \\ &= \sum_{0 \leq i < n} i^{(2)} + i^{(1)} \\ &= n^{(3)}/2 + n^{(2)}/2 \\ &= n(n-1)(n-2)/3 + n(n-1)/2 \\ &= n(n-1)(n/3 - 2/3 + 1/2) \\ &= n(n-1)(2n - 1)/6 \\ \end{aligned}

Trying to perform this process in general does beg a question: how do we convert from

x^n

into some combination of rising and falling factorials? It turns out that Stirling Numbers will show up to aid this conversion. But before that, let's see some more connections.

§ $2^x$ as the combinatorial version of $e^x$

We want to find the analogue of the exponential function

e^x

, which satisfies the equation

f'(x) = f(x)

. Setting this up in the discrete case:

\begin{aligned} &d'f(x) = f(x) | f(0) = 1 \\ &f(x+1) - f(x) = f(x) | f(0) = 1 \\ &f(x+1) = 2f(x) | f(0) = 1 \\ &f(n) = 2^n \end{aligned}

What does this buy us? It gives us a nice proof that

\sum_{k} nCk = 2^n

. It proceeds by taking the taylor of

e^x

, "combinatorializing it", and then simplifying:

\begin{aligned} &e^x = \sum_{n=0}^\infty \frac{x^n}{n!} \\ &2^x = \sum_{n=0}^\infty \frac{x^{(n)}}{n!} \\ & = \sum_{n=0}^\infty \frac{x*(x-1)*(x-2)*\cdots(x-n+1)}{n!} \\ & = \sum_{n=0}^\infty \frac{x*(x-1)*(x-2)*\cdots(x-n+1)}{n!} \end{aligned}

§ Harmonic series as the combinatorial version of logarithm

In the continuous case, we have that

\int 1/x = \log x

. In the discrete case, we get

\sum_{i=1}^n 1/x \equiv H_n

, the sum of the first

n

harmonic numbers. This explains why the harmonic numbers keep showing up in different places across discrete math --- We're often trying to integrate some kind of

1/x

quantity. This also may tell us that

H_n

and

2^n

are somewhat "weak inverses" of each other. I haven't thought about this weak inverse thing too much.

§ Stirling numbers of the first kind for convering between polynomials and falling factorials

We can express

x^{(n)} = \sum_{k=0}^n [n, k] x^k

where

[n, k]

are the (yet to be defined) unsigned stirling numbers of the first kind. (aside: I wonder if this can be derived from some kind of mobius inversion).
We begin my in fact defining the stirling numbers of the first kind,

[n, k]

as the coefficients of the rising factorial:

x^{(n)} = x(x+1)(x+2) \dots (x+n-1) = \sum_{i=0}^n [n, i] x^i

The question as a combinatorialist is:

what do the unsigned striling numbers of the first kind, $[n, i]$ , count?

The answer is:

$[n, i]$ counts the number of permutations of $n$ elements with $i$
disjoint cycles.

There is a more evocative definition, which goes like this:

$[n, i]$ counts the number of ways to seat $n$ people at $i$
circular tables, such that no table is left empty.

For example, in the case of the permutations of the set

\{1, 2, 3\}

, we have the permutations:

\begin{aligned} (1)(2)(3) \text{3 cycles} \\ (12)(3) \text{2 cycles} \\ (1)(23) \text{2 cycles} \\ (2)(13) \text{2 cycles} \\ (132) \text{1 cycle} \\ (123) \text{1 cycle} \\ \end{aligned}

So, this gives the counts:

$[3, 1] = 2$
$[3, 2] = 3$
$[3, 3] = 1$

From the seating people perspective, here's how this goes:

$[3, 1]$ is the number of ways to seat 3 people at 1 table. The first person sits at the table in only one way . The second person can sit to the left or to the right of the first person, but this is symmetric: they too have only one way . Now the third person can sit to the left of the first person or to the left of the second person. So there are two ways . In total, there are only two ways. Key idea: On a circular table, it only matters who is to your left, since there is no difference between left and right. The second person has two choices: they
$[3, 3]$ is the number of ways to seat 3 people at 3 tables so that no table is empty. We are forced to place one person per table.
$[3, 2]$ is the number of ways to seat 3 people at 2 tables so that no table is empty. So we will need to have two people on the first table, and then the final personal on the second table. Once we pick which two people sit together in the first table, we are done, because for upto two people, it doesn't matter how they sit at a table, the situation is symmetric.

These stirling numbers satisfy a recurrence:

[n+1, k] = n[n, k] + [n, k-1]

This can be seen combinatorially. If we want the number permutations of

n+1

objects with

k

cycles, we can either:

Take the permutations of $n$ objects with $k-1$ cycles, $[n, k-1]$ , and then insert the new $(n+1)$ th object into a new cycle all by itself. That is, we create a new permutation where $p(n+1) = (n+1)$ .
Take the permutation of $n$ objects with $k$ cycles, and insert this new $(n+1)$ th object into any of the $k$ cycles. If the original permutation had x -p-> y, we can insert this new object as x -p-> * -p->y for any x. Thus, there are $n$ choices of locations to inser * --- as the image of all possible xs. This gives us the $n[n, k]$ term.

Another cute fact of the unsigned stirling numbers of the first kind

[n, k]

is that since permutations are partitioned by their number of cycles, we have:

\sum_{k=1}^n [n, k] = n!