[isabelle] The definitional principles of HOL and equivalent mechanisms in Q0/R0



Hi John,

Thanks for the additional information. I will need some time to digest the 
papers you refer to.


Concerning the definitional principles of HOL, already in the past I have spent 
some thoughts about how to express the same in Q0/R0, and consider including 
some remarks in my R0 publication.

To avoid misunderstandings, let me refer to the extensions of the logic as 
discussed in sections 2.5.1 - 2.5.4 of the current HOL4 manual at:
	http://freefr.dl.sourceforge.net/project/hol/hol/kananaskis-10/kananaskis-10-logic.pdf


2.5.1 Extension by constant definition (John: "new_definition")

In the R0 implementation, definitions are not implemented as part of the 
logical kernel, but are labels attached to well-formed formulae (wffs) used for 
parsing and printing. I consider definitions simply as abbreviations, hence 
there is no need to make them part of the logical reasoning by introducing 
theorems for them. Consequently, definitions (abbreviations) can be removed and 
introduced again at any time without consequences for the reasoning itself, and 
without the danger of causing an inconsistency. (The wff to be abbreviated 
doesn't even need to be a closed term.) See the following R0 log, where 
comments start with (##) and output with (#):

## print universal quantifier ALL (A)
## (short and long - fully typed - version)
A
# wff      29 :  [\t.[\p.((=_[\x.T])_p)]]                   :=  A
# w typd   29 :  
[\t{^}{^}.[\p{{o,t{^}}}{{o,t{^}}}.((={{{o,{o,t{^}}},{o,t{^}}}}_[\x{t{^}}{t{^}}.T
{o}]{{o,t{^}}}){{o,{o,t{^}}}}_p{{o,t{^}}}{{o,t{^}}}){o}]{{o,{o,t{^}}}}]
## for the formatted version, see p. 356 at:
## http://www.kenkubota.de/files/R0_draft_excerpt_with_definitions_and_groups.pdf

## print Axiom 1 (Law of Excluded Middle) - with defined ALL
%A1
#                ((=_((&_(g_T))_(g_F)))_((A_o)_[\x.(g_x)]))
[...]
## for the formatted version, see p. 351 at:
## http://www.kenkubota.de/files/R0_draft_excerpt_with_axioms_and_proof_of_C_function.pdf
## or Andrews' version for Q0 at:
## http://plato.stanford.edu/entries/type-theory-church/#ForBasEqu

## remove definition of universal quantifier ALL (A)
:= A

## print Axiom 1 (Law of Excluded Middle) - without defined ALL
%A1
#                ((=_((&_(g_T))_(g_F)))_(([\t.[\p.((=_[\x.T])_p)]]_o)_[\x.(g_x)]))
[...]

## define universal quantifier again (with type information)
:= A [\t{^}{^}.[\p{{o,t{^}}}{{o,t{^}}}.((={{{o,{o,t{^}}},{o,t{^}}}}_[\x{t{^}}{t{^}}.T
{o}]{{o,t{^}}}){{o,{o,t{^}}}}_p{{o,t{^}}}{{o,t{^}}}){o}]{{o,{o,t{^}}}}]
# wff      29 :  [\t.[\p.((=_[\x.T])_p)]]                   :=  A

## print Axiom 1 (Law of Excluded Middle) - with defined ALL again
## there is no difference to the original result above
%A1
#                ((=_((&_(g_T))_(g_F)))_((A_o)_[\x.(g_x)]))
[...]

## remove all definitions used in Axiom 1
:= &; := T; := F; := A

## print Axiom 1 (Law of Excluded Middle) - without any definition, as it is 
actually in the logic
## consisting only of the constant equality (=), variables, lambda abstraction 
(\), and application (_)
%A1
#                
((=_(([\x.[\y.((=_[\g.((g_((=_=)_=))_((=_=)_=))])_[\g.((g_x)_y)])]]_(g_((=_=)_=)
))_(g_((=_[\x.((=_=)_=)])_[\x.x]))))_(([\t.[\p.((=_[\x.((=_=)_=)])_p)]]_o)_[\x.(
g_x)]))


2.5.2 Extension by constant specification (John: "new_specification")

The specification example in the manual
	Â(b1 = b2)
or
	not (b1 = b2)
with b1 and b2 of type bool could be introduced as a hypothesis h.

Then we would have theorems
	h |- t1
	h |- t2
etc.

If h would be inconsistent (h |- F), we could prove: not h.

Otherwise (the consistent case), the variables (b1 and b2) in h are free, e.g.,
	not (b1 = b2) |- t1
	not (b1 = b2) |- t2
such that the restrictions on free variables in the set of hypotheses for Rule 
R' [Andrews, 2002, p. 214] apply, and hence, the "variables" practically behave 
like constants, as in HOL. Note that neither the Rule of Universal 
Generalization (5220) nor the Rule of Substitution (5221) are applicable due to 
their restrictions on free variables in the set of hypotheses [cf. Andrews, 
2002, p. 222], which means that b1 and b2 could not be instantiated if they 
would appear as free variables in t1 or t2.


2.5.3 Remarks about constants in HOL

This passage, according to my memory the same as the original by Andy Pitts, 
discusses the "dependency on type variables", arguing that "explicitly 
recording dependency of constants on type variables makes for a rather 
cumbersome syntax".
However, with type abstraction, types are not a separate syntactic category 
anymore, but terms of type tau. This means that the type variables are subject 
to the same restrictions of Rule R' as regular variables, resulting in a 
uniform treatment of both regular and type variables, and in addition to that, 
the binding of type variables with lambda preserves dependency information.
With the means of type abstraction, explicit tracking of the dependency on type 
variables can be expressed very naturally with the means of the formal 
language, as (in general) intended by Russell.


2.5.4 Extension by type definition

The corresponding mechanism in R0 was described recently:

"In R0, the interpretation of types is that of (nonempty) sets. Naturally, one 
would see, for example,
the number 3 as a natural number, and therefore has its type:
	3 : N   (or "3_N" in Tex/LaTeX notation)
[...]
The consequence is, of course, that a mathematical entity may have several 
types, as naturally
one would like to express 3 : N, 3 : S5, or 3 : R (three is a real number).
[...]
Since terms like [\x.x<5] (S5, the set of the numbers smaller than 5), can 
become a type, the strict
distinction between terms and types is removed, as when a theorem like
	(S5 3)
is obtained, S5 becomes a type by adding the types of types (tau) to it, and
to the number 3 the new type S5 is added."
	https://lists.cam.ac.uk/pipermail/cl-isabelle-users/2016-October/msg00050.html

The advantage in comparison to the HOL mechanism is, as I believe, that new 
types are introduced very naturally, no axioms have to be assumed, and there 
is no duplication of mathematical structures (linked with a bijection / an 
isomorphism).


In summary, the definitional principles of HOL can be equivalently expressed by
- allowing wffs to be printed (and parsed) by shorthands attached to them,
- Andrews' Rule R' (with a uniform treatment of regular variables and type 
variables),
- and a simple, natural mechanism for introducing new types.
Forgive me if I made a mistake while quickly writing down my ideas.


Concerning HOL Light, I recall that in the current R0 draft (from May 18, 
2015), HOL Light is the main reference, as it has the same underlying concept 
of an extremely small logical kernel:
"The axioms of R0 are only logical axioms. For a number of reasons discussed 
below no non-logical axioms (those not necessary for formal logic) are allowed 
in R0. HOL Light has 'three mathematical axioms' [Harrison, 2009, p. 62], and 
two of them are non-logical axioms: the axiom of choice and the axiom of 
infinity.
In summary, if 'HOL Light is [...] a clean and simplified version of Mike 
Gordon's original HOL system' [Harrison, 2009, p. 60], then R0 is even cleaner 
and simpler than HOL Light and follows consequently the path of maximising 
logical rigor by reductionism." [Kubota, 2015, p. 14 f.]
(Maybe the comparison is slightly unfair, since R0 is not designed for 
automation, opting for Hilbert-style like Q0.)

Also, other design decisions for HOL Light and R0 are identical.
For example, having to decide between variable names and a numbering system for 
variables (like de Bruijn indices), I opted for names in order to make the 
logical reasoning as explicit as possible (besides better readability).
Similarly, you wrote on HOL Light: "I reimplemented all the prelogic 
completely, abandoning de Bruijn terms in favour of a return to name-carrying 
syntax."
	https://www.cl.cam.ac.uk/~jrh13/hol-light/tutorial_220.pdf (p. 222)


Maybe it is best to continue discussion after the publication of R0, when 
comparisons can be made easier.


Best,

Ken

____________________

Ken Kubota
doi: 10.4444/100
http://dx.doi.org/10.4444/100



P.S.:

The full internal representation of Axiom 1 (Law of Excluded Middle), which is 
exposed if all definitions (abbreviations) are removed, is - without and with 
full type information -:

%A1
#                
((=_(([\x.[\y.((=_[\g.((g_((=_=)_=))_((=_=)_=))])_[\g.((g_x)_y)])]]_(g_((=_=)_=)
))_(g_((=_[\x.((=_=)_=)])_[\x.x]))))_(([\t.[\p.((=_[\x.((=_=)_=)])_p)]]_o)_[\x.(
g_x)]))
#                                                           :=  A1
#                
((={{{o,o},o}}_(([\x{o}{o}.[\y{o}{o}.((={{{o,@},@}}_[\g{{{o,o},o}}{{{o,o},o}}.((
g{{{o,o},o}}{{{o,o},o}}_((={{{o,@},@}}_={ at }){{o,@}}_={ at }){o}){{o,o}}_((={{{o,@},
@}}_={ at }){{o,@}}_={ at }){o}){o}]{ at }){{o,@}}_[\g{{{o,o},o}}{{{o,o},o}}.((g{{{o,o},o
}}{{{o,o},o}}_x{o}{o}){{o,o}}_y{o}{o}){o}]{ at }){o}]{{o,o}}]{{{o,o},o}}_(g{{o,o}}{
{o,o}}_((={{{o,@},@}}_={ at }){{o,@}}_={ at }){o}){o}){{o,o}}_(g{{o,o}}{{o,o}}_((={{{o
,{o,o}},{o,o}}}_[\x{o}{o}.((={{{o,@},@}}_={ at }){{o,@}}_={ at }){o}]{{o,o}}){{o,{o,o}
}}_[\x{o}{o}.x{o}{o}]{{o,o}}){o}){o}){o}){{o,o}}_(([\t{^}{^}.[\p{{o,t{^}}}{{o,t{
^}}}.((={{{o,{o,t{^}}},{o,t{^}}}}_[\x{t{^}}{t{^}}.((={{{o,@},@}}_={ at }){{o,@}}_={
@}){o}]{{o,t{^}}}){{o,{o,t{^}}}}_p{{o,t{^}}}{{o,t{^}}}){o}]{{o,{o,t{^}}}}]{{{o,{
o,\3{^}}},^}}_o{^}){{o,{o,o}}}_[\x{o}{o}.(g{{o,o}}{{o,o}}_x{o}{o}){o}]{{o,o}}){o
})



> Am 24.10.2016 um 04:55 schrieb John.Harrison at cl.cam.ac.uk:
> 
> Hi Ken,
> 
> | John's e-mail on the motivation for HOL Light and his comparison with
> | Andrews' systems is highly interesting. I would suggest an article "On the
> | Logical Foundations of HOL Light" or similar to make the information
> | persistent, as the policy of the mailing list provider may change. For now,
> | I have quoted from the publicly available e-mail via the online link.
> 
> I am glad you found it interesting. Indeed, I don't think I've ever
> written a detailed canonical description of the HOL Light foundations,
> though there are brief summaries scattered in various places, so this is
> certainly something to think about.
> 
> On the history of HOL Light and related systems, in addition to those
> you have already discussed, let me shamelessly plug a book chapter
> written by Josef Urban, Freek Wiedijk and myself on the history of
> interactive theorem proving in general. This in particular discusses LCF
> and its successors, especially HOL, in section 3, complete with another
> little "family DAG" of HOL systems on page 19:
> 
>  http://www.cl.cam.ac.uk/~jrh13/papers/joerg.html
> 
> Also of historical interest may be my old unpublished note "HOL Done
> Right":
> 
>  http://www.cl.cam.ac.uk/~jrh13/papers/holright.html
> 
> That describes an intermediate stage in the evolution of HOL Light when
> the logical kernel still had implication as a primitive. The setup
> described there is also of some relevance to the discussion of type
> definition principles in a related thread, because it sketches what (as
> Rob notes in his paper) was the first version of a definitional
> principle that is roughly a "unary" version of his own idea. At the risk
> of reiterating what you can read in Rob's paper, let me explain my
> motivations there.
> 
> Roger Jones gave an excellent description of the origins of what in
> HOL88 were called "new_definition" (a simple |- c = t binding) and
> "new_specification" (introducing constants satisfying a predicate given
> an existential theorem). One more perceived rough edge in HOL88 and
> hol90 that I wanted to fix in HOL Light was that there should indeed be
> two completely different definitional principles when they are
> tantalizingly close to being interderivable:
> 
> * Since |- exists x. x = t is clearly provable, one can derive 
>   "new_definition" fron "new_specification". There is however a
>   bootstrapping problem because "new_definition" is used to 
>   define the existential quantifier on which "new_specification"
>   depends!
> 
> * Once the epsilon operator is present, one can easily derive
>   "new_specification" from "new_definition", but only with the
>   motivating drawback mentioned by Roger Jones that you get
>   "too much information" about the introduced constants.
> 
> In the early versions of HOL Light I chose the slightly different
> approach of taking as primitive a variant of "new_specification" that
> allowed one to introduce a new constant c and a theorem |- P[c] given an
> input theorem |- P[t] for a specific term t, not |- exists x. P[x] as in
> "new_specification". Then no quantifiers are involved, and
> "new_definition" is a trivial instance because of reflexivity |- t = t,
> while "new_specification" can be derived from the epsilon operator
> without exposing too much information. One needs of course some
> restrictions on variables and type variables analogous to those Roger
> discussed. (Amusingly, I too got them wrong in the first instance, as
> Tom Melham spotted instantly when I first showed him the code. This
> was long before HOL Light was a public system.)
> 
> At some point I switched HOL Light back to the simple |- c = t
> "new_definition" because it seemed slightly simpler and more intuitive,
> and I have personally never been very interested in the "too much
> information" problem (I generally want to define things explicitly, not
> leave them underspecificed). I can quite well see the arguments for
> Rob's version though.
> 
> John.





This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.