Re: [isabelle] lexical matters



Hi everyone, shall we make this the last e-mail on this subject open to the
list?  I understand that this thread is perhaps not of the question/answer
type that most Isabelle users would be interested in.  Why don't people
e-mail me if they want to see or participate in any of the continuations and
we'll keep it private.  Unless there are lots of objections to keeping it
private that is...

on 19/1/11 9:56 AM, Alexander Krauss <krauss at in.tum.de> wrote:

> Hi Mark,
>
> This discussion is a bit strange. It seems this is mostly due to a
> conflation of at least three very different functionalities that you
> expect from a "theorem prover":
>
> (a) Checking Proofs -- ...
>
> (b) Developing Proofs -- ...
>
> (c) Auditing Definitions and Theorem Statements -- ...
>
> I hereby make the bold claim that any system trying to do any two of
> these things at the same time is bound to make some lousy compromises
> that aren't acceptable if you are really serious.
>
> ...

I think it's helpful to the discussion to break it down into different
activities like you have.  You are right to differentiate between checking
proofs and developing proofs.  I have always viewed auditing definitions and
theorem statements as part of the process of checking proofs, but I suppose
you could split it off as a different activity.

But I would say that there is absolutely no conflict in practice in catering
for (a) and (c) together in the same system, and furthermore that they
naturally go hand-in-hand.  The inference kernel does (a), and is
architecturally separate from the pretty printer which supports (c).  Both
have trustworthiness as their top priority.
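To illustrate the separation I mean (a toy model only -- none of these systems is written in Python, and Python cannot enforce abstraction the way ML modules do), the LCF idea is that theorems are values of an abstract type whose only constructors are the kernel's inference rules, while the printer lives entirely outside the kernel:

```python
# Toy model of an LCF-style architecture.  All names here are
# illustrative, not taken from any real system.

class Thm:
    """A theorem.  Only kernel inference rules may construct one."""
    _key = object()  # private token guarding the constructor

    def __init__(self, concl, key):
        # In ML this guarantee comes from the module signature;
        # here we merely simulate it with a hidden token.
        if key is not Thm._key:
            raise TypeError("theorems can only be made by kernel rules")
        self.concl = concl

# --- inference kernel: trusted, does (a) ---
def refl(tm):
    """REFL: produces the theorem |- tm = tm."""
    return Thm(("=", tm, tm), Thm._key)

# --- pretty printer: trusted for (c), architecturally separate ---
def print_thm(th):
    """Render a theorem for auditing; cannot create or alter theorems."""
    op, lhs, rhs = th.concl
    return f"|- {lhs} {op} {rhs}"
```

A bug in `print_thm` can mislead the auditor (the (c) concern) but can never manufacture a bogus `Thm`, which is why the two trusted components coexist without conflict.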

There is some conflict between (a)/(c) and (b).  So both (a) and (c) ideally
involve using a system where trustworthiness is the top priority, together
with sufficient efficiency to enable (a) and with certain aspects of
usability to support doing (c) effectively (such as good pretty-printing of
definitions and theorems, and good user feedback for certain basic
operations).  To achieve the trustworthiness, the source code for the
trusted parts is ideally written with the utmost simplicity and clarity.
But for (b), the top priorities are general usability and efficiency, with
trustworthiness having less importance.  There is a conflict between general
usability and clarity of the inference kernel, because things like
hierarchical theories, for example, can help usability but (at least
traditionally) involve greatly complicating the inference kernel.  Also, to
pursue efficiency for (b), the inference kernel is typically made
significantly bigger and more complicated.

So yes, HOL Zero was written to support (a)/(c) and avoid the conflicts with
(b).  But no, I don't see there being any significant compromises in the HOL
Zero implementation.  Take a look for yourself and see!  And then compare it
with HOL4, ProofPower HOL and Isabelle.  (I can point out the equivalent
parts of each system if you are really interested in doing this.)

HOL Light is an interesting case.  Its simplicity (and the simplicity of
its embryonic forerunner, the GTT system implemented by Konrad and John)
is the inspiration for HOL Zero.  I would suggest that HOL Light sits
between (a) and (b) with some slightly uncomfortable compromises (its lack
of theory hierarchy, for example, arguably makes it less usable), but that
it at the same time does a surprisingly good job.  The inference kernel has
only 10 inference rules and is very simple, but it is not at all unusably
slow.  And Flyspeck is showing that HOL Light is up to the task of being
used on big projects.  But anyway, I saw HOL Light as still unsatisfactory,
if only due to its pretty printer making it inappropriate for the (c) role,
and reluctantly decided that a new system was required to fill the gap.

I think HOL4, ProofPower and Isabelle are aiming for (b) and get the balance
about right.  Their use of an LCF-style architecture gives good assurance
but without conflicting with (b) in my opinion.  I think the balance is a
bit wrong as regards their pretty printers (because the problems can be
relatively easily addressed without compromising on (b), and furthermore
this would actually help (b)), and this is what this thread has become
about.

> The issues about someone maliciously attacking the kernel etc. are
> orthogonal and probably have standard solutions. Operating systems and
> security people can solve this as soon as it is important enough for
> someone to pay the price. But logicians are not the right people to do
> this.

They are not orthogonal in my opinion, because the LCF-style architecture
itself delivers trustworthiness to protect against both the malicious and
the accidental.  I think the orthogonality comes in when differentiating
between trustworthiness within the scope of the ML program implementing the
theorem prover and trustworthiness outside its scope (e.g. bugs in the ML
interpreter, operating system interrupts, etc, etc).  But I agree that we
should get the security experts in to help with the latter.

> Mark Adams wrote:
>> Not so simple that existing theorem provers have trusted pretty
>> printers (other than HOL Zero)....
>
> previously, Mark Adams wrote:
>> but just a healthy degree of concern, especially when the problems
>> can be easily addressed.
>
> So are you claiming that it is easily addressed or that it isn't???
> Maybe your reasoning needs to be checked by a trusted kernel?

:-)

I'm claiming that it's easy.  My suggestion that it was "not so simple" was
just to emphasise that nothing much has been done about these things until
now.

> Anyway, what properties must a pretty printer have in order to be
> trustworthy? Is there anything formal to be said here apart from Freek
> Wiedijk's Pollack-consistency? Or is it just defined in terms of the
> attacks you happen to imagine right now? It would be interesting to read
> some details here instead of just a claim of how important it is.

I've spent a long time thinking about it and independently came up with the
tightest of Freek's "Pollack consistency" concepts.  I called it
"parser/printer completeness".  But there's more to this than Freek's paper
because he doesn't cover "printing soundness" - i.e. printing what the user
expects (it's misleading to print an "and" when internally it's an "or").
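To make the round-trip property concrete (a minimal sketch with a hypothetical term representation, not the machinery of any actual prover), parser/printer completeness says that printing a term and re-parsing the output must recover the original term:

```python
# Hypothetical propositional terms: a variable is a string, a binary
# connective is a tuple ("and" | "or", left, right).

def print_term(t):
    """Print a term with full parenthesisation."""
    if isinstance(t, str):
        return t
    op, lhs, rhs = t
    return f"({print_term(lhs)} {op} {print_term(rhs)})"

def parse_term(s):
    """Parse the output format of print_term (returns None on junk)."""
    s = s.strip()
    if not s.startswith("("):
        return s  # a variable
    body = s[1:-1]  # strip the outer parentheses
    depth = 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif depth == 0 and body[i:i + 5] == " and ":
            return ("and", parse_term(body[:i]), parse_term(body[i + 5:]))
        elif depth == 0 and body[i:i + 4] == " or ":
            return ("or", parse_term(body[:i]), parse_term(body[i + 4:]))

def round_trips(t):
    """Parser/printer completeness, checked for one term."""
    return parse_term(print_term(t)) == t
```

Note that this property alone would still hold if the printer rendered "and" as "or" and the parser inverted the swap, which is exactly why printing soundness -- printing what the user expects -- is a separate requirement.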

Mark.




