[isabelle] Performance considerations of locales



Hi,

In our applications we make widespread use of locales, both on the Isar toplevel
and in Isabelle/ML, where locales are auto-generated and certain definitions are made and
theorems are proved within these locales.

Recently we hit a tipping point where more CPU time is spent on the
declaration of locales than on the actual proof work performed within them.
I did some analysis to get a better understanding of what actually happens, and
I want to initiate a discussion on how this situation might be improved in the
future.

First, a high-level description of our use of locales, which triggered the tipping point
(a minimal Isar sketch follows the list):

1. Locales are used to structure and modularise our work.
2. We stay abstract: our clear focus is on declaring new
  locales on the basis of existing locales rather than interpreting locales.
3. Most of the time an abstract entity within a locale (a parameter "fixes x :: 'a") stays the same in
  the complete locale hierarchy. It is neither renamed nor is its type changed when being imported into
  another locale. It might be shared among multiple locales.
4. Within a locale we make definitions and prove theorems, which are noted.
  A typical right-hand side (rhs) of a definition can be a sizeable term containing polymorphic constants;
  think, for example, of a sequence of about 100 monadic binds in a state monad.
5. We use attributes on definitions as well as theorems to maintain our data for proof automation.
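
For concreteness, here is a minimal Isar sketch of this usage pattern (all names and
terms are made up for illustration; the real right-hand sides are far larger):

  locale base =
    fixes x :: 'a

  locale derived = base   (* import: x is neither renamed nor retyped, cf. (3) *)
  begin

  definition big :: "'a => 'a" where
    "big y = (if y = x then x else y)"   (* stands in for a large polymorphic rhs, cf. (4) *)

  lemma big_x [simp]: "big x = x"   (* attribute maintains proof-automation data, cf. (5) *)
    by (simp add: big_def)

  end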

Reading and digesting the excellent background papers on locales [1] and local theories [2], and
studying and profiling the actual code, I have come to the conclusion that the actual
hotspot is the omnipresent instantiation of
locales. This happens in the roundup of locale expressions, both when defining new locales and
when entering the context of a locale. Profiling highlights low-level functions like Term.fold_atyps,
Term_Subst.generalizeT_same, Term_Subst.inst_same...

This is not surprising, as the implementation of locales and local theories always starts from the
global theory, where all definitions / theorems (notes) are generalised, with the 'fixes' being
schematic variables and the 'assumes' being explicit preconditions. When entering a target context,
the (term and type) variables get instantiated and the preconditions become hypothetical assumptions.
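
With the hypothetical sketch above, the two forms of the same theorem can be inspected directly:

  thm derived.big_x   (* background theorem in the global theory:
                         generalised, with schematic ?x *)

  context derived
  begin
  thm big_x   (* the same theorem in the target context:
                 instantiated to the fixed x *)
  end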

This process becomes so costly in our application because of (4) and (5), and it is especially
unsatisfying for the user because of (3): from a user perspective nothing has to be instantiated.
It turns out that the use of attributes (5) unfortunately renders the "lazy notes" optimisation
ineffective (in contrast, a theorem without attributes is represented as a lazy value and hence
only instantiated when it is actually used, which speeds up entering a locale or composing locales).
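
A small illustration of the contrast, again building on the hypothetical base locale above:

  locale lazy_demo = base
  begin

  definition d :: "'a" where "d = x"

  lemma plain: "d = x"   (* no attribute: the note can stay a lazy value *)
    by (simp add: d_def)

  lemma eager [simp]: "d = x"   (* the attribute must be applied eagerly, forcing
                                   instantiation whenever the locale is rounded up *)
    by (simp add: d_def)

  end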

I guess there is a huge design space to improve the situation. Here are some of my thoughts:

I. Keep the terms small
a) Manual workaround: instead of a local definition we could manually make a global definition of the
   large rhs, and then only make a local definition having the (folded) global one as its rhs
   (see the sketch after this list). This is more or less the first half of what a local definition does anyway.
b) Infrastructure: provide an option for local definitions that only introduces the abbreviation
   but does not note the instantiated equation in the target context.
II. Improve the common case of "canonical" locale instances (points 2 + 3) by some
   sort of caching or pre-evaluated instances.
III. Extend the lazy notes idea to "lazy attributes", giving user-level control over what is already relevant
    for the declaration of new locales and what is only relevant for doing proofs later on.
IV. Make instantiation cheaper, e.g. by changing the low-level representation of cterms (say, an
   environment plus a term skeleton instead of a fully instantiated term; ML antiquotations to
   match on cterms and construct cterms might bring the convenience of programming with
   terms to cterms, without having to fully convert between terms and cterms all the time).
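
For (I.a), the manual workaround looks roughly as follows (hypothetical names, with the
base locale from above; big_rhs stands in for the large term):

  definition big_rhs :: "'a => 'a => 'a" where
    "big_rhs x y = (if y = x then x else y)"   (* the large rhs lives in the global theory *)

  locale derived' = base
  begin

  definition big' :: "'a => 'a" where
    "big' = big_rhs x"   (* the local definition stays small: its rhs is just the
                            folded global constant applied to the locale parameter *)

  end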

Manually keeping terms small (I.a) or avoiding attributes feels like breaking the
nice and convenient abstraction and falling back to bare-metal, error-prone tool development.
I hope that some (or a combination) of the ideas above, or some other ideas, materialise as a rescue.

Find attached three benchmark theories illustrating some aspects:

Benchmark_poly.thy: large terms within locale using polymorphic constants
Benchmark_mono.thy: monomorphic variant
Benchmark_global.thy: Workaround (I.a), keeping terms in definitions and theorems (notes) small.


[1] Clemens Ballarin. Locales: a module system for mathematical theories. Journal of Automated Reasoning, 52(2):123–153, 2014.
[2] Florian Haftmann and Makarius Wenzel. Local theory specifications in Isabelle/Isar. In Stefano Berardi, 
   Ferruccio Damiani, and Ugo de Liguoro, editors, Types for Proofs and Programs, TYPES 2008, 
   volume 5497 of Lecture Notes in Computer Science. Springer, 2009.







Regards,
Norbert

--

Norbert Schirmer (nschirmer at apple.com)
 SEG Formal Verification




