Re: [isabelle] Isabelle Foundation & Certification

Hi Makarius,

Many thanks for your comments, and for the acceptance decision which will make the rebuttal phase very relaxed.

I'll send the rebuttal comments tomorrow.

All the best,

-----Original Message-----
From: cl-isabelle-users-bounces at [mailto:cl-isabelle-users-bounces at] On Behalf Of Makarius
Sent: 28 September 2015 15:56
To: cl-isabelle-users at
Subject: Re: [isabelle] Isabelle Foundation & Certification

Here is now my virtual review of the ITP paper behind this thread. It came out a bit longer than anticipated, although I included only half of the thoughts that came to me, while reading it 3 times.


Title: A Consistent Foundation for Isabelle/HOL
Authors: OndÅej KunÄar, Andrei Popescu

Overall evaluation: accept
Reviewer's confidence: high


The paper revisits the question how overloaded definitions (of constants and type constructores) may be understood within the HOL logic. This work consists of three parts:

   (1) Technical treatment of the HOL logic, with a notion of "definitional
   theory" and its well-formedness, interpretation of this extended HOL logic
   in a substantially reformed version of the original HOL88 interpretation
   due to A. Pitts.

   (2) A proposed change to the Isabelle2014 code base, to upgrade its
   'typedef' mechanism to perform the additional well-formedness checks
   described in the paper. The text refers to which provides some
   changesets and one page of explanations how this fits into the setup of
   the paper.

   (3) General motivation and discussion of perceived problems of
   Isabelle/HOL, which the paper sets out to resolve. This part is apt to
   cause confusion in various ways, both to users and non-users of Isabelle.
   The style of presentation considerably subtracts from this otherwise fine

Part (1)

These are sections 3, 4, 5 in the paper. Section 5 is the main contribution.
The classic HOL logic of HOL88 (due to M. Gordon and semantics by A. Pitts) is taken as starting point. There is a nice explanation in 4.3, why overloaded definitions of constants or types does not work out in the interpretation of A. Pitts (HOL88 book from 1993). Thus a new interpretation is introduced as follows:

   * Types are not immediately interpreted away into sets, but preserved as
   syntactic entities to keep sufficient structure that allows well-founded
   recursion. The paper calls this "a natural syntactic-semantic blend, which
   avoids stunt performance in set theory".

   * Type interpretation starts out on ground types. Polymorphism is
   interpreted only in the final stage, treating it as strictly schematic
   over all ground interpretations.

   * Models for ground interpretation are constructed in stages, following
   the well-founded dependency relation of the given collection of
   definitions. Here types and constants are interpreted simultaneously for
   each stage, while in the HOL88 semantics by A. Pitts all types are
   interpreted before all constants.

This model construction is fit together with other recent work by the first author [19], opening the perspective of fully checked definitional theories in Isabelle/HOL. More about this in part (2).

Taking the little space of a paper in LNCS proceedings into account, the technical treatment of the problem is worked out very carefully. Numerous fine points and additional side-conditions are included in the considerations.

Nonetheless, there is a potential for confusion due to slight deviation of terminology and concepts, compared to papers on the subject e.g. by Wenzel/Haftmann and the Isabelle documentation (isar-ref manual).

In particular: Section 3.4 defines a "definitional theory" as a set of overloaded definitions (for constants or types) such that certain *local* syntactic conditions hold. Section 4.2 defines a "well-formed definitional theory" as a definitional theory with well-founded dependency relation as
*global* condition. Section 4.5 establishes the main theorem that a well-formed definitional theory is consistent -- by demonstrating that it has a model in the above sense, and concluding that False cannot be derived in the inference system -- due to soundness of the interpretation.

In contrast, the specifications of a "definitional theory" of the paper would be called in the Isabelle documentation "unchecked definitions", "definitional axioms", or "axiomatizations", to emphasize the missing aspect of global well-formedness that this paper is mainly about. A "well-formed definitional theory" of the paper would be just called "definitional theory"
in Isabelle terminology, and much stronger properties intended than just consistency. Instead of merely ensuring the existence of a model, the requirement is to preserve derivability of old propositions precisely, and allow to reduce new derivations of new propositions into old derivations by "expanding" definitions in some way. If and how this works depends on fine-points of the underlying definitional mechanisms: for overloaded constant definitions in Isabelle/Pure this works [13, 35], but for type definitions the situation is a-priori quite different, e.g. see [13] section 4.3. The deeper reason for this are the distinctive categories of terms and types in HOL, and the lack of equational reasoning on types.

A suitable semantic treatment is required to re-unify the view on terms and types, as is done in the present paper. But this leads to weaker results.
The paper does briefly mention "a suitable notion of conservativeness" as future work, but due to lack of further explanations, many readers will probably miss key point behind this.

There is another aspect of "consistency" versus "conservativity" that is easily missed in the paper. Section 3.5 states "Isabelle/HOL theory development proceeds by: 1 ... 2 ... 3 ...". The three kinds of operations ensure that every stage of theory development is "definitional" in the sense of the paper -- and "well-formedness" can be established later. But what happens when the user starts adding non-definitional axioms in between?
Strictly speaking, the main result does not apply and the whole theory looses its good definitional properties.

In contrast, the traditional explanation of (constant) definitions in Isabelle is more modular in that respect: given an arbitrary base theory --- one that somehow makes sense to the user in a particular application, even after making unchecked axiomatizations --- definitions merely introduce names for existing entities and thus preserve the key properties of the base theory. This agnostic approach of "definitional theory extensions"
(occording to Isabelle terminolgy) is particularly important from the perspective of the Isabelle/Pure framework, where the interpretation of object-logics is not yet known.

For HOL applications one might argue that genuine axiomatic extensions are not done in practice these days. Nonetheless some exotic applications like HOLZF (by Obua) do exist. Without stronger results under which conditions typedefs "make sense" or "are OK" (using words from the paper), such ambitious users would have to revisit the whole model theory of HOL again.

Part (2)

The patch to Isabelle2014 is rather minimal: the existing infrastructure to check overloaded constant definitions (going back to Isabelle2007, and slightly improved by O. Kuncar [19] is generalized to cover type definitions as well. So Isabelle/Pure is upgraded to provide a general service for "definitional specification items" that are identified in the name space for constants or types, but without looking at actual content.

While the explanations for the patch states "the situation is from a technical (implementation) point view a little bit more complicated", the outcome is actually simpler than in the paper. Being forced to strip away accidental aspects of the HOL object-logic does occasionally have advantages.

For example, the delicate notion of "build-in types" vs. "non-built-in types" is absent in the implementation. The notes on the patch provide some explanations, why it works out nonetheless. It would be nice to see this refinement applied to the main work of part (1), to trim it further down to the very essence of symbolic specifications with schematic polymorphism and overloading.

The proposed change to Isabelle2014 is called "correction patch", as if something would be broken that is fixed by the change. This misunderstanding leads directly into the general discussion of this work below.

Part (3)

These are sections 1 "Introduction", section 2 "Related Work", section 6 "Conclusion", i.e. the important pieces that put the technical contribution into proper perspective. This is what most people read, and what attendants of a conference presentation who are busy with their e-mails or smart-phone usually take home. Unfortunately, the story being told here do not quite fit to Isabelle.

The misunderstanding already starts in the first paragraph, where Isabelle/HOL is included into the "umbrella term" of "HOL-based provers".
HOL4, HOL-Light, ProofPower, HOL Zero are fine systems, but Isabelle/HOL is not as closely related to them as the "HOL" name might suggest. In many respects Isabelle is actually closer to Coq.

Technically, the key misunderstanding is the role of "the kernel" (in the words of the paper). HOL88 as the predecessor of all the other HOL systems pioneered an add-on to the original LCF kernel design to have checked definitions as primitive rules. In contrast, Isabelle is closer to LCF in separating logical inferences and a-priori unchecked (axiomatic) theory
specifications: there is no notion of definitions in the inference kernel of Isabelle. Since this is a bit impractical for big applications, more and more checks on theory content have been added over the years (e.g. 2004, 2005, 2007, 2010, 2014), but that process was never formally closed. And this is what the paper is actually about: complete emulation of HOL88-style definitions in Isabelle/HOL (by providing additional services in the Isabelle/Pure framework).

We can now revisit the critical examples in the light of the Isabelle architecture and tradition, instead of the one of HOL88.

Example 2 is presented as the main failure point to motivate this work, but it is merely an illustration what the isar-ref manual, section 11.5 "Typedef axiomatization" covers abstractly: 'typedef' is a certain axiom scheme of Isabelle/HOL with some local sanity checks. There is no claim of it being "definitional" (in Isabelle terminology, "well-formed definitional" in the paper). So the proof of False is not a problem of the Isabelle inference kernel, but the consequence of producing a malformed axiomatic theory.

A legitimate discussion would be about the chances of users making such mistakes in practice. The results of the present paper certainly improve the situation, but after applying the "correction patch" (i.e. the additional feature of fully-checked typedefs) no known applications of Isabelle + AFP are rejected. This is good, but it also indicates relatively low practical impact.

Other interesting information that is missing: How often do type definitions depend on open parameters due to overloading, notably type class parameters?
The original concept of overloading in Isabelle did not expect types to depend on unresolved overloading at all: such closed type definitions could be interpreted within the original setting of HOL88, after expansion of all constant definitions. In recent years such applications have shown up nonetheless, e.g. a type of polynomials that refers to a generic constant "zero" (see src/HOL/Library/Polynominal.thy e.g. in Isabelle2014 or Isabelle2015). This indicates that the extra expressive power of typedefs over type class parameters occasionally matters.

It is good that the new explanation of overloading treats constants and types uniformly, but there is also an investment into a new model theory that deviates from the other HOL systems. I actually like the new interpretation better than the one by A. Pitts, but without the mainstream HOLs following this upgrade, a translation of Isabelle/HOL theories into e.g. HOL Zero or OpenTheory for separate checking would mean that overloaded typedefs are axiomatic specifications without any well-formedness checks at all.

The older Example 3 is presented somewhat dramatically "when Obua took over the, he found that the overloaded were almost completely unchecked". Here is a quote from the relevant documentation (Isabelle2004 or Isabelle2005, "ref"
manual, section 6.1.1 "Definitions"):

   Definitions are intended to express abbreviations. The simplest form of a
   definition is f == t, where f is a constant. [...]

   Isabelle makes the following checks on definitions: [...]

   These checks are intended to catch the sort of errors that might be made
   accidentally. Misspellings, for instance, might result in additional
   variables appearing on the right-hand side. More elaborate checks could be
   made, but the cost might be overly strict rules on declaration order, etc.

Why do users and especially power users tend to ignore documentation? There are many more fine points about Isabelle that could be misunderstood in similar ways; and not all of them are as explicitly documented.

A potential explanation why users of Isabelle might consider the system to be "absolutely right" and "unbreakable" is that in everyday life serious incidents are never encountered, even after several years of intensive use.
Thus the sense for the degree of reliability is lost. We desparately need for fine-grained assesment of the overall quality of ITP systems than the "Consistency Club" versus "Inconsistency Club" in the paper.

My impression is that the "Consistency Club" is presently empty: none of the major ITP systems are absolutely well-defined and correctly implemented in every respect. Note that HOL Light is often quoted is being really small and formally well-understood, but it resides in the unsafe environment of the OCaml toplevel: users need to refrain from playing dirty tricks in ML.

Formalizations of complete systems like CakeML + classic HOL look very promising, but the community is not there yet. For big environments like Isabelle or Coq full formalization appears futile to me. A better strategy is to connect to some smaller and fully-proven proof checker. For the present paper, this means the authors either need to formalize the extended model theory of HOL with full overloading themselves, or convince the colleagues [18] to do such an upgrade of their formalization.

Then we only need OpenTheory to follow suite, and a proper Isabelle/OpenTheory implementation as constructive proof that Isabelle/HOL actually fits under the same "umbrella" of HOL-based systems. Then we could proceed to the next order of magnitude in overally reliability.

This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.