Re: [isabelle] soundness of Isabelle/HOL
I would summarise the potential soundness-related vulnerabilities of
LCF-style systems as follows:
1. Small risk that the inference kernel still has problems with the
design/implementation of its logic. This might include:
- a) errors in tricky term operations such as substitution (most or
perhaps every HOL system has had problems with this at some stage);
- b) errors in the inference rules or axiom/definition commands (this risk
is amplified if the kernel implements various derivable inference rules as
2. Vulnerabilities in the implementation of the LCF-style architecture.
This might include:
- a) backdoors to creating theorems, thus sidestepping the logic (e.g. by
importing theorems from disk purely as statements without proof);
- b) not keeping track of theory (e.g. not maintaining a list of the
definitions of each constant in the theory, thus making it difficult to
review definitions and what has been assumed);
- c) ability to overwrite crucial ML values and/or abstract datatypes
(such as the list of axioms or the datatype for theorems);
- d) vulnerability from aspects of the implementation language (e.g.
OCaml's mutable strings, OCaml's Obj.magic) .
3. Problems associated with the pretty printer (and arguably the parser)
fooling the user about what has been proved. This might include:
- a) ambiguously displaying theorems (such as not annotating theorems with
their types, or not distinguishing between overloaded entities). This has
become know recently as Pollack Inconsistency;
- b) vulnerability to having the pretty printer overwritten or extended in
a problematic way.
My HOL Zero system addresses most of these vulnerabilities, although there
is little that can be done about some of the 2c), 2d) and 3b)
vulnerabilities if an ML toplevel is allowed. It is implemented in OCaml,
but does manage to avoid the mutable strings problem. I will be porting it
to SML sometime soon hopefully, which is a fundamentally more secure
language. I give out $100 bounty rewards for anyone uncovering new
unsoundness-related vulnerabilities. The website includes a list of all
soundness-related vulnerabilities that have been uncovered (even those found
by myself). See the HOL Zero page at:
The inference kernels of most other LCF-style systems are actually a bit
complex, involving a few thousand lines of ML. As of 2012, the risks of
vulnerability 1 are probably quite small because they get used so much, but
there have been some unsoundnesses uncovered in the 1990s, and I would very
much like developers to clearly publish all discovered soundness-related
vulnerabilities in their systems.
HOL Light manages to keep the inference kernel down to about 500 lines
(depending on exactly how you count the lines) by various measures,
including: not implementing derivable inference rules in the kernel (other
than TRANS) and having a monolithic theory instead of maintaining a theory
hierarchy. John Harrison has proved the consistency of the HOL Light
kernel, but this just addresses vulnerability 1. Indeed I can prove false
without leaving a trace in the HOL Light state in about 5 lines of ML!
I'm fairly sure that alll existing HOL systems (other than HOL Zero) still
have various known vulnerabilities in their pretty printer, although some
have been addressed.
on 30/1/12 10:00 AM, Matthias Schmalz <Matthias.Schmalz at inf.ethz.ch> wrote:
> Out of curiosity and for my thesis introduction, I have the following
> I am wondering whether there is a theorem prover out there that gives
> stronger soundness guarantees than Isabelle/HOL and whether there is
> empirical evidence showing that the difference is practically relevant.
> I would also like to know when the last unsoundness bug in Isabelle's
> inference core has been observed.
> I already know that Isabelle follows the LCF approach and that HOL is
> built from a modest number of axioms using conservative extension
> methods. It is therefore very likely that proofs by Isabelle are
> correct. I also know that this soundness guarantee is restricted to the
> inference core; for example, nothing prevents users from configuring
> the parser to parse "False" as "True" and therefore giving the
> impression that "False" can be proved. (And of course, soundness
> rests on the assumption that compiler, ML libraries, operating system,
> and hardware behave correctly.)
> Many thanks in advance,
This archive was generated by a fusion of
Pipermail (Mailman edition) and