§ Mutual recursion elaboration in Lean
Lean has four backends for elaborating mutual definitions.
- Lean, given a mutual def block, can compile to (1) partial, which is just an opaque blob in the kernel, (2) primitive recursion on an inductive type via
recOn
, (3) well founded induction via WF
, and (4) brecOn
+ casesOn
, which allows us to split the recursion into the pattern matching part (casesOn) and the recursion part (brecOn).
- (2)
recOn
is primitive recursion, and is synthesized by the kernel for every inductive declaration. It is often complicated to elaborate pattern matching syntax into a primitive recursor, and is a research question for the mathematically correct, complete solution which handles all cases. This is the lowest level of recursion in Lean, and supports good definitional equality. However, the code generator does not currently generate code for recOn
of mutal inductives, and thus cannot be executed. When working with objects that live in Type
, it is a good idea to use recOn
right now, since (a) it reduces correctly, and (b) has no computational content.
- (3)
WF
is well founded recursion on a terminating metric, which allows one to express functions more easily than primitive recursion. Currently, mutual recursion elaborates into WF
. The drawback is that it has poor definitional equality, and thus breaks a lot of convenient reasoning. It has support in the code generator, since the code generator supports evaluating recOn
of non-mutual definitions (which WF
is).
- As an example of where
WF
is more convenient than recOn
, think of ackermann in first order logic. It's not primitive recursive, but does terminate by well founded induction on the lexicographic metric of the naturals. Another example is the hydra tree, which is a crazy game which is known to be finite, but any proof system that can prove the game is finite has at least as much proof strength as PA (Kirby and Paris).
- (4)
brec
+ casesOn
which is used to elaborate inductive predicates. brec
is bounded recursion, which allows using k-inductoin: using inductive hypothesis upto k children behind you. Useful for encoding things like fibonacci, where for a S(S(n)) depends on S(n) and n (2-induction). This way of elaborating mutual inductives splits the matching part ( casesOn
) from the indction part ( brecOn
), and is thus more convenient to elaborate into than a lower level recOn
. There are some bugs luring in the lean elaborator for inductive predicates, so this is not fully figured out.
- Coq gets away with this stuff, because coq has
fix
+ match
in the kernel, and they have guardedness checks in the kernel which checks that the fix is structurally decreasing or whatever. This is complicated, and has led to many soundness bugs in the kernel. Thus, Lean wishes to avoid this.
- Thus, in an ideal world, we would improve the elaborator to elaborate everything into
rec
, and would teach the code generator how to code generate mutual rec
.
§ Simp Bottlenecks
Benchmarking simp with perf showed us that the bottleneck in one example
was in congr
, which recursively calls simp
and dsimp
(dsimp is a
variant of simp which preserves definitional equality). This needs to be
investigated further.
Another bottleneck could be that simp processes bottom-up. This can lead
to quadratic behaviour on certain tests. For example, consider:
(not (and A B) = (or (not A) (not B)
We denote the currently processed node with square brackets [.]
If we
proceed top-down, see that we would need a quadratic number of steps,
because we need a linear number of steps to reach the top from the
bottom, where we push down the not
. We must repeat this till fixpoint.
(not (and (and a b ) c ))
(not (and (and a b ) [c]))
(not (and (and a [b]) c))
(not (and (and [a] b) c))
(not (and [and a b] c))
(not [and (and a b) c])
[not (and (and a b) c]
;; TRANSFORM=>
(or (not (and a b) (not c))
;; ...
§ Simp lemma generation
If we define functions in a mutual def block, and we tag these functions
as simp
, then simp must generate simp lemmas. If we have a definition
of the form:
inductive X where
| X1 | X2 .. | Xn
def foo: X -> X -> Bool
| X1, _ => True
| _, X2 => False
the theorems will be:
theorem foo.simp1 (x x': X) (h: x = X1): foo x x' = True.
theorem foo.simp2 (x x': X) (h: x /= X1) (h': x' = X2): foo x x' = False.
This could be very expensive in case we have complicated mutual
definitions, since Lean can blow up if we have many inductives.