Re: [isabelle] [Agda] Prior work on proof assistant performance / optimization

Thanks!  This is very interesting.  (And thanks all for sources.)

I'm still digesting this, but have one question already:
> Clearly, we should use NbE/environment machines for evaluation, and
implement conversion checking in the semantic domain.

Does this mean that we can't be agnostic about whether or not we're
supporting HoTT, because we either have to pick set theory/irrelevant
equality proofs or cubical (or some other model of HoTT) when normalizing
proofs of equality?

> I'd be very interested in your findings about proof assistant performance.

I'd be happy to share once I have my thesis in a sufficiently ready state!
Broadly, the message of my thesis is that performance of
(dependently-typed) proof assistants / interactive theorem provers is an
interesting area of research sorely in need of systematic study.  (My
thesis will also include some of the research work I've done, which can be
rendered as "how to work around some of the asymptotics in Coq without
needing to change the system", but I think that's a bit less broadly


On Fri, May 8, 2020 at 4:55 AM András Kovács <kovacsandras at>

> Hi Jason,
> You may be interested in my github repos: smalltt
> <>, normalization-bench
> <>. I haven't yet
> updated the smalltt repo, but there's a simplified implementation
> <>
> of its evaluator, which seems to have roughly the same performance but
> which is much simpler to implement.
> The basic idea is that in elaboration there are two primary computational
> tasks, one is conversion checking and the other is generating solutions for
> metavariables. Clearly, we should use NbE/environment machines for
> evaluation, and implement conversion checking in the semantic domain.
> However, when we want to generate meta solutions, we need to compute
> syntactic terms, and vanilla NbE domain only supports quote/readback to
> normal forms. Normal forms are way too big and terrible for this purpose.
> Hence, we extend vanilla NbE domain with lazy non-deterministic choice
> between two or more evaluation strategies. In the simplest case, the point
> of divergence is whether we unfold some class of definitions or not. Then,
> the conversion checking algorithm can choose to take the full-unfolding
> branch, and the quoting operation can choose to take the non-unfolding
> branch. At the same time, we have a great deal of shared computation
> between the two branches; we avoid recomputing many things if we choose to
> look at both branches.
> I believe that a feature like this is absolutely necessary for robust
> performance. Otherwise, we choke on bad asymptotics, which is surprisingly
> common in dependent settings. In Agda and Coq, even something as trivial as
> elaborating a length-indexed vector expression has quadratic complexity in
> the length of the vector.
> It is also extremely important to stick to the spirit of Coquand's
> semantic checking algorithm as much as possible. In summary: core syntax
> should support *no* expensive computation: no substitution, shifting,
> renaming, or other ad-hoc term massaging. Core syntax should be viewed as
> immutable machine code, which supports evaluation into various semantic
> domains, from which sometimes we can read syntax back; this also leaves it
> open to swap out the representation of core syntax to efficient
> alternatives such as bytecode or machine code.
> Only after we get  the above two basic points right, can we start to think
> about more specific and esoteric optimizations. I am skeptical of proposed
> solutions which do not include these. Hash consing has been brought up many
> times, but it is very unsatisfying compared to non-deterministic NbE,
> because of its large constant costs, implementation complexity, and the
> failure to handle sharing loss from beta-redexes in any meaningful way
> (which is the most important source of sharing loss!). I am also skeptical
> of exotic evaluators such as interaction nets and optimal beta reducers;
> there is a good reason that all modern functional languages run on
> environment machines instead of interaction nets.
> If we want to support type classes, then tabled instance resolution
> <> is also a must, otherwise we are
> again smothered by bad asymptotics even in modestly complex class
> hierarchies. This can be viewed as a specific instance of hash-consing (or
> rather  "memoization"), so while I think ubiquitous hash-consing is bad,
> some focused usage can do good.
> Injectivity analysis is another thing which I believe has large potential
> impact. By this I mean checking whether functions are injective up to
> definitional equality, which is decidable, and can be used to more
> precisely optimize unfolding in conversion checking.
> I'd be very interested in your findings about proof assistant performance.
> This has been a topic that I've been working on on-and-off for several
> years. I've recently started to implement a system which I intend to be
> eventually "production strength" and also as fast as possible, and
> naturally I want to incorporate existing performance know-how.
> Jason Gross <jasongross9 at> ezt írta (időpont: 2020. máj. 8., P,
> 0:20):
>> I'm in the process of writing my thesis on proof assistant performance
>> bottlenecks (with a focus on Coq).  I'm having some trouble finding prior
>> work on performance engineering and/or benchmarking of proof assistants
>> (other than the paper on the Lean tactic language (
>>, and figured I'd reach
>> out to these lists.
>> Does anyone have references for prior work on performance analysis or
>> engineering of proof assistants or dependently typed languages?  (Failing
>> that, I'd also be interested in papers about compile-time performance of
>> compilers.)
>> Thanks,
>> Jason
>> _______________________________________________
>> Agda mailing list
>> Agda at

This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.