Re: [isabelle] lexical matters
On 19/1/11 4:02 AM, Gerwin Klein <gerwin.klein at nicta.com.au> wrote:
>> In the large, safety-critical certification projects I have been involved in,
>> the evaluators do the best they can with the tools they have available.
>> Proof checkers with highly trustworthy output are a luxury not available to them.
> Depends on the project, of course, and what level of certification.
> Currently not that many certifications use interactive formal proof,
> but some do (ACL2 has been used quite a bit). As you say, evaluators need to
> take the best that is available. LCF provers are a long way ahead of what
> they usually have to deal with. That's why I don't think it's a problem in practice.
So it's OK because LCF systems don't tend to get used! I'm trying to enable
LCF systems to be used in practice. This could greatly increase the assurance
in such projects.
>> Are we to ban non-trivial ML in the process of producing a proof?
Doing so would reduce what can be done in the project.
> If you're really concerned about malicious users, why not? Usual Isabelle
> interaction makes no use of ML. Very few of the AFP entries, many of them
> substantial, do anything remotely like that.
Yes, but I've been involved in large safety-critical projects where the use of
ML has been a practical necessity. Formal methods work is currently extremely
expensive in general when it involves formal proof (though not in the projects
I have been involved with).
>> The contracted, safety-critical proof work I have been involved with would
>> certainly have been completely infeasible without writing large amounts of
>> ML to support my proof work.
> In Isabelle? HOL4 or HOL-light, sure, that's their main interaction mode,
> but Isabelle hasn't had the ML-as-user-interface paradigm for quite a few years.
Yes, so Isabelle would be safer in this respect because the proof scripts
aren't even in ML. But to enable cost-effective large formal verification
projects involving formal proof, I think allowing the contractor to use the
power of ML to overcome practical problems, specific to the particular
project, would be an extremely useful thing. I'm talking about bespoke
automation and bespoke semi-automation. Now it's always possible to do a
bit of bespoke pre-processing and still use a restricted, non-ML paradigm,
but this will often have its own integrity risks and is in any case
fundamentally limited. We need to give contractors the full power of ML to
get the costs of full formal verification down. But we shouldn't trust them
not to be malicious!
> I'm not saying that you'll never want to use any ML for bigger advanced
> projects, but these ML bits are a long way off from complex ML programs.
> Compared to the effort that should go into validating models, checking
> small amounts of proof method ML code for pretty printing cheat attempts
> would be trivial (there's no reason to have any pretty printing code there
> at all). There's no reason evaluators should have to accept large amounts
> of complex ML code.
I've been involved with large safety-critical projects that use 20,000 lines
of bespoke ML.
>> It is far better for the auditor if they can treat the ML code that
>> establishes the final theorem as some sort of black box. To do this the
>> theorem prover needs certain basic properties, including a trustworthy
>> pretty printer.
> Well, there is one, you just don't seem to like it: inspecting the term on
> the ML level. As Michael says, it's at least as good as Lisp.
Good concrete-syntax printers make it much easier to read. When hundreds of
lines of specification need to be reviewed, concrete syntax is a practical
necessity.
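To illustrate the point: inspecting a term "on the ML level" means reading raw
constructor trees, which quickly becomes unreadable for specifications of any
size. The contrast can be sketched in miniature (a hedged Python illustration
of the general idea; the `Var`/`Const`/`App` datatype and the printer here are
invented for the example and are not Isabelle's actual API):

```python
from dataclasses import dataclass

# A toy term datatype loosely modelled on the first-order representation
# used in HOL-family provers. Illustration only, not Isabelle code.

@dataclass(frozen=True)
class Var:
    name: str
    ty: str

@dataclass(frozen=True)
class Const:
    name: str
    ty: str

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

def pretty(t):
    """A naive concrete-syntax printer: infix for binary constants."""
    if isinstance(t, (Var, Const)):
        return t.name
    if isinstance(t, App) and isinstance(t.fn, App) and isinstance(t.fn.fn, Const):
        return f"({pretty(t.fn.arg)} {t.fn.fn.name} {pretty(t.arg)})"
    return f"({pretty(t.fn)} {pretty(t.arg)})"

# The term x + y, as an auditor would see it at each level:
term = App(App(Const("+", "num->num->num"), Var("x", "num")), Var("y", "num"))

print(repr(term))    # raw ML-level-style view: verbose nested constructors
print(pretty(term))  # concrete syntax: (x + y)
```

Multiply the raw view by hundreds of lines of specification and the case for a
trustworthy concrete-syntax printer in the audit process makes itself.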
> There's a difference between accidentally creating something misleading
> (reasonably hard to do if you use the normal interface) and a malicious user
> with motivation and criminal energy. If you want to guard against the
> latter, low-tech plain terms and external proof checking surely bring more
> assurance than further infrastructure.
I find that analysts just make mistakes, especially when things are being
done on the industrial scale, and especially when there is a team of
analysts. Now occasionally these mistakes will surface in really bizarre
ways, and things that one would think "this would never happen in practice"
actually do happen. I regret not keeping a careful log of strange problems.
But problems with the pretty printer have certainly caused all sorts of
trouble, including, I think, some connected with the soundness of the analysis
being undertaken. These problems have often been quite subtle.
> I guess we're all mostly of the same opinion fundamentally anyway, just with
> different views on where to put resources and emphasis.
>> And this is the problem that I am advocating needs a tool that properly
>> supports the process - the process of determining that the model and final
>> theorem are right (as well as that the final theorem has actually been proved).
>> This process involves using the pretty printer.
> Checking that the proof shows what you think it does involves the pretty
> printer only at a very shallow level. It's not the case that we can't use
> the Isabelle pretty printer to support development and that it will
> magically show us wrong terms.
The complex Isabelle derived code, outside the relatively small core of code
implementing crucial trusted components such as the inference kernel and the
pretty printer, may be susceptible to constructing expressions that end up
exploiting problems with Isabelle's pretty printer. I have found that such
things are easy to subtly create maliciously (but of course if the user
cannot use ML, I think we can assume the developers won't do this). But I
have also found that in practice on large projects, subtle problems with
tools, including theorem provers, just happen accidentally.
For example (forgive me if my lack of deep understanding of the Isabelle
core means I am wrong here), presumably it is quite feasible, in principle,
that some complex bit of derived Isabelle code occasionally constructs terms
that involve overloaded variables. This is certainly true in the "3 HOL"
systems (i.e. the 4 HOL systems minus HOL Zero :). So nasty terms involving
overloaded variables could "magically" get innocently constructed by
well-meaning source code developed by well-meaning Isabelle developers.
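The overloaded-variables hazard can be sketched in miniature (a hedged Python
illustration of the general mechanism, not Isabelle's actual term datatype):
in HOL-family representations a variable is identified by a name *and* a type,
so two distinct variables can share a name, and a printer that shows only
names renders them identically.

```python
from dataclasses import dataclass

# Illustration only: a variable identified by a (name, type) pair,
# as in HOL-family term representations.

@dataclass(frozen=True)
class Var:
    name: str
    ty: str

# Two *different* variables that a name-only printer cannot distinguish:
x_bool = Var("x", "bool")
x_num = Var("x", "num")

def naive_pretty(v):
    return v.name  # drops the type, silently hiding the overloading

assert x_bool != x_num                              # distinct terms...
assert naive_pretty(x_bool) == naive_pretty(x_num)  # ...identical output

# A trustworthy printer must disambiguate, e.g. by annotating the types:
def safe_pretty(v):
    return f"{v.name}:{v.ty}"

print(safe_pretty(x_bool), safe_pretty(x_num))  # x:bool x:num
```

A theorem containing two such variables can thus print as something that looks
like a statement about a single variable, which is exactly the kind of subtle
misreading an auditor relying on the printed form could fall into.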
> I've often seen and written theorems that didn't mean what I thought they
> meant. It's so far never been the fault of the pretty printer. It can be,
> because you'd be able to trick it with some effort, but getting rid of that
> possibility is about assurance, not about finding the right proof.
> It can have different solutions.
Yes, the risk is that the wrong theorems are being proved or that things are
being given the wrong definitions, and this is why we need proof auditing,
and this is why we need good concrete-syntax pretty printers, to enable the
auditor to effectively review things. But I find I'm repeating myself now.