# Re: [isabelle] rigorous axiomatic geometry proof in Isabelle

On Sun, Jun 3, 2012 at 12:53 AM, Bill Richter <richter at math.northwestern.edu
> wrote:

> I'm trying to use my math skills to get into the proof assistant game.
> I know all of you are much better than I am at CS, and I would think
> that you would all be better than I at mathematical logic as well.
> And maybe you are, but I think I understand something that nobody ever
> writes, so I wonder if folks do understand this.
>

I think you'll find that people who use computerised proof technology often
do already have a good understanding of mathematical logic (and
"metamathematics"), including Gödel's theorems.

> Mathematical logic studies what you can and can't prove axiomatically
> in first order logic (FOL).

I would contend that the field goes further and looks at other logics as
well, including extensions like second-order and higher-order logic, as
well as "restrictions" on first-order logic like non-classical logics
http://en.wikipedia.org/wiki/Mathematical_logic.

> The Goedel Completeness Theorem
> http://mathworld.wolfram.com/GoedelsCompletenessTheorem.html says
> truth in every model implies an axiomatic proof, and the Goedel
> Incompleteness Theorem says you can't write a computer program that
> will print out all theorems of a model of the natural numbers:
> http://mathworld.wolfram.com/GoedelsFirstIncompletenessTheorem.html
>

The phrase "all theorems of a model of the natural numbers" would perhaps
To me, theorems are statements with proofs, and models of a statement or a
theory are structures that make that statement or theory true.

The Completeness Theorem says that if a statement is true in every
structure (i.e. every structure models the statement), then the statement
has a proof.
(Implicitly, we first fix a first-order theory, i.e. a set of axioms and a
signature, then only consider statements in the language of that signature,
structures over that signature, and proofs over those axioms. The
Completeness Theorem holds no matter what theory we choose.)
If the set of axioms can be generated by a computer, then all (and only)
the provable statements of a theory can also be generated by a computer.
This is why first-order logic is "semidecidable".

The First Incompleteness Theorem says that if we fix a theory whose axioms
can be generated by a computer, whose signature allows the natural numbers
as a structure, and which proves various statements about the natural
numbers (e.g. Peano's axioms) but doesn't prove every statement (i.e. it is
consistent), then there exists some statement which is true in the natural
numbers but not provable.
Notes: Since the theory proves Peano's axioms but is consistent, the
natural numbers must be a model of this theory. But by the Completeness
Theorem, the unprovable statement cannot be true in all models of this
theory. Therefore the theory must have another "non-standard" model where
the unprovable statement is false.

What you said about a computer program not being able to print all theorems
is not a limitation of computer programs (because a program can always
print out all theorems of a first order theory). It's a limitation of
first-order theories to capture the natural numbers: they always end up
also being true of non-standard models that differ from the natural numbers
but the theory can't tell.

> There's a simpler related result I don't know the name of, which says
> that given any nice set of FOL axioms, you can write a computer
> program to print out all the theorems that follow from them,
> i.e. theorems which have axiomatic FOL proofs using these axioms.
>

Right. I'm not sure this result has any particular name. The "nice set" of
FOL axioms means a "recursively enumerable set".

> Mathematicians basically restrict themselves to the FOL ZFC set theory
> axioms, and so we can write a computer program that will print out (if
> it runs indefinitely) every theorem that mathematicians well ever
> prove (unless they adopt new axioms). Turing's proof of the Halting
> problem (you can't write a infinite-loop checker) is just a
> restatement of the Goedel incompleteness theorem.

The Lambda Calculus comes out of math logic, as it gives a definition of a
> recursive
> function that's equivalent to Turing-machine definable.
>
> The whole point of proof assistants is to make this math logic work!
> Nobody's gonna run the simple program that prints out all the theorems
> that follow from the FOL ZFC axioms and wait for a good theorem to pop
> up.  But you folks have written lots of smart proof assistants which
> do great work on today's incredibly fast machines.
>

A bit of clarification: proof assistants usually consist of a proof
checking part and a proof development part. The proof checking part simply
checks that proofs are valid proofs, i.e. they come from the axioms via
inference rules. The proof development part may include tools for
structuring theories and tools for generating proofs automatically or
semi-automatically.

>
> So what do proof assistants (Coq, Isabelle, HOL Light) do?  I would
> assume they all start with some FOL axioms and then deduce axiomatic
> FOL proofs as one discusses in math logic.  I contend that the proof
> assistants must do that, because (by math logic) they can't do
> anything else!  And the mathematicians can't do anything else either!
>

No it's not true, because neither proof assistants nor mathematicians are
restricted to FOL.
Also, none of the proof systems really do any proving (unless you count the
automated components); rather, they enable you to construct a formal proof,
while checking all the time that what you're doing is valid.
In HOL Light and Isabelle/HOL (Isabelle itself is a logical framework that
can be instantiated at many different logics) let you conclude statements
that are either HOL axioms or derived from other theorems via HOL rules of
inference. The Wikipedia page has a list of the inference rules for HOL
Light http://en.wikipedia.org/wiki/HOL_Light. These systems don't actually
construct proofs (by default); rather, they construct "theorem" objects,
and limit the methods for constructing theorems to exactly the rules of
inference, thereby ensuring that whenever you have a theorem object, you
know a proof of it must exist (it was created ephemerally and dynamically
in the object's creation). This is the so-called "LCF approach".
Coq, on the other hand, does construct proofs by default, but in another
logic again: dependent type theory. In this logic, the proofs and the
statements live in the same language.
I'm sure others will be able to fill in more details here if you're
interested...

> But I'm puzzled because almost nobody talks this way in the proof
> assistant world.  Mizar (and Freek's miz3 port to HOL Light) seems
> reasonably close to FOL axiomatic proofs.  Although I don't know
> exactly how close.  I have this vague idea that Mizar does axiomatic
> FOL proofs with a lot of the tedium automated away, e.g. manually
> substituting variables.  But I don't know.  The things that folks
> actually do---HOL, Isar, Pure---I don't understand.  I would need to
> have them explained in terms of Goedel's field of math logic,
> i.e. axiomatic FOL proofs. So I propose that if someone knows the answer,
> they tell me, and if nobody knows the
> answer, let's run a thread on math logic which will eventually explain
> how HOL, Isar, Pure etc relate to Goedel's FOL.  I have very little to
> contribute to such a thread beyond the note on the LC Y combinator and
> Goedel incompleteness below.
>

Well, perhaps we can continue after you've read about the syntax and
semantics of HOL and let's see if you have more specific questions then.

>
> --
> Best,
> Bill
>
>
> We can understand Odifreddi's [p 80--1] startling remark that the Y
> combinator "embodies the argument used in Russell's paradox" from
> Boolos & Jeffries [Ch 14-15].
>
> That is, I'll show how the Lambda Calculus Y combinator comes from
> Goedel Incompleteness, and how Goedel I sort of comes from Russell's
> paradox.  In particular, B&J's treatment of the Goedel fixed point
> theorem is much clearer than Barendregt's [Thm (Goedel), sec 6.7].
>
>
> ******** Goedel's fixed point thm => Y combinator ********
>
> For any expression B(y) of number theory, B&J [Lem 2, p 173] show
> there exists an expression F(x) such that for any expression M(x),
>
> Q |-   F( #M ) <--> B( #M(#M) )
>
> where Q is the minimal axiom set for number theory [B&J Ch 14], s.t. a
> partial function N ---> N is representable in Q iff it's
> recursive.  I'm denoting Goedel numbers by #.
>
> Then letting G = F(#F) and plugging in, we have Goedel's result
>
> Q |-  G <--> B( #G )
>
> Well, that gives us the Y combinator, just take out the #s and the
> Qs.  We want
>
>         (F M) = (B (M M))
>
> for all M, so we let
>
>          F = \m. B (m m)
>          G = F F
> then
>          G = B G
>
> giving us the fixed point for B, which we encode as the Y combinator
>
>          Y = \b. (\f. f f) (\m. b (m m))
>
> That's the only reasonable motivation I've ever seen for the Y
> combinator.  And maybe that explains that LC_v requires LC: as you
> say, LC_v is for programming, LC is for Math logic.  But we need Y in
> order to be able to derive the harder Y_v.
>
> ******** Goedel fixed point thm => Goedel I ********
>
> Let E be the expression of number theory, E_0 the closed expressions,
> or statements.  Let Th(N) be the statements that are true in the
> standard model of N.  We'll write, for statements f in E_0,
>
> |= f
>
> if f in Th(N).
>
> Goedel proved that Th(N) is not decidable, meaning there is no
> expression B(x) in E such that for any statement f in E_0,
>
> |=  f   if   Q |-  B(#f)
> |= -f   if   Q |- -B(#f)
>
> But this statement implies the more tractable sounding
>
> |= f   iff   |= B(#f)
>
> since provable in Q implies true in N.  So we show this statement is
> false, which shows the previous statement is false too.
>
> Let G = F(#F) be the "fixed point" for -B (which means not B'), so
>
> Q |-  G = F( #F ) <--> -B( #F(#F) ) = -B( #G )
>
> so pushing from Q to Th(N) and combining with the above, we have
>
> |= G    iff    |=  -B( #G )  iff |= -G
>
> a contradiction.  Hence Th(N) is undecidable.
>
>
> ********  Russell's paradox => Goedel I  ********
>
> We also have that for any expression M(x),
>
> |= F( #M )  iff  |= -B( #M(#M) )  iff |= -M(#M)
>
> The statement F( #M ) means F is satisfied by the Goedel number of M.
> Let's translate F( #M ) into set theory as M \in F.   Then we have
>
> M \in F   iff   M \notin M
>
> That is, F is the "set" F = { M : M \notin M }.  That's the Russell
>
>                      F \in F   iff   F \notin F
> or
>                            G   iff   not G
>
> And we can even see where F & G came from.  Aping Russell, we'd like
> to define a formula F(x) such that for any M(x),
>
> F is satisfied by #M iff #M does not satisfy M.
>
> Well, we can do that if Th(N) is decided by B(y), we "define"
>
> F(#M) = -B( #M(#M) )
>
> Of course it's not clear that such an expression F(x) exists.  Let's
> recall B&J's argument, since I don't quite understand it.
>
> Recall \exists is the Tex symbol for "there exists", the backwards E.
>
> Then B&J define
>
> diagonalization: E ---> E
>
> by diagonalization(M) = \exists_x ( x = #M & M )
>
> So if N = N(x), i.e. x is the only free variable in N, then
>
> Q |-  diagonalization(N) <--> N(#N)
>
> Note the privileged position of the variable x here.  Then B&J show
> [Lem. 1, p 172] that diagonalization is representable in Q (don't
> understand the proof yet), meaning that there's an expression A(x,y)
> in E such that for any expression M in E,
>
> Q |- \exists_y ( A(#M, y) <--> y = #diagonalization(M) )
>
> Then we define the expression F(x) by
>
> F = \exists_y ( A(x,y) & B(y) )
>
> so that for any M,
>
> Q |-   F(#M) <--> B( #diagonalization(M) )
>
> Then letting G = diagonalization(F), we have
>
> Q |-  G  <--> F(#F) <--> B( #diagonalization(F) ) = B( #G )
>
>
> ********  Caution about analogies ********
>
> Why not set-theorize Goedel's fixed point theorem?  For any B(y),
>
> Q |-   F( #M ) <--> B( #M(#M) )
>
> I guess that's
>
>  M \in F   iff (M \in M) \in B
>
> so F is the "set"
>
> F = { M : (M \in M) \in B }
>
> Hmm, looks pretty fishy...  I can't figure out how to make sense out
> of that.  Or, how to translate into number theory Cantor's argument
> from which Russell's paradox springs:
>
> Given any set X and any function h: X ---> 2^X, h is not onto.
>
> Proof: Let R = { x \in X : x \notin h(x) }.  Then we claim R \in 2^X
> is not in the image of h.  If it were, say R = h(y), then we have a
>
> y \in R    iff    y \in h(y)    iff    y \notin R
>
> \qed
>
> Sandy thought that thinking of 2^X as truth functions on X rather than
> subsets of X might get us somewhere, but I don't see anything yet...
>
>
> For another hazy analogy, Odifreddi says on p 81
>
> Set Theory                Lambda Calculus
> element                   argument
> set                       function
> membership                application
> set formation             lambda abstraction
> set equality              term equality
>
> Then he says that \x. N (x x) is the LC analogue of the Russell set
> { x : x \notin x }
> for any term N "that is never the identity".  Pretty shaky...
>
> So maybe we should give Goedel some credit for making the translation
> of Russell's paradox to number theory :-D
>
> ********************************************************
>
> Sandy Zabell pointed me to Goedel's original argument in Nagel &
> Newman's book "Goedel's Proof".  I was so proud to have discovered
> Russell's paradox in Boolos & Jeffries's proof, but N&N make the
> connection quite clear.  They say that Goedel pointed out that he was
> following Richard's parodox, which was concocted in 1905 by a guy
> Jules Richard I never heard of, but it goes like this:
>
> Enumerate the formulas M(x) of LA as F_n(x), and ask whether F_n(n) is
> true, which is like the question X \in X. Then let R(n) be the
> statement -F_n(n), which is like the Russell set
> R = { X : X \notin X}.
> Assume (here's the contradiction) that R(n) is given by one of our
> formulas, so that R(x) = F_m(x), for some m, and then
>
>                       F_m(m) = R(m) = -F_m(m)
>
> which is Richard's paradox.  Here's a less cryptic account (to me)
> than N&N's of how Goedel might have modified Richard's paradox to
> prove that any consistent axiomatizable theory T containing Q is
> incomplete.  I actually end up with different conclusions than N&N and
> Goedel, so I'd be interested in you reading this.
>
> LA is the 1st order language of Arithmetic.  A formula M of LA is
> written M(x) if FV(M) = {x}.  Given a number n \in N, the standard
> model of LA, they write M(n) for what in the LC we'd call M[x<-n],
> which doesn't require x to be free in M, and this is generalized in LA
> as
>                     M(n) := \exists_x. x = n & M
>
> After inventing Goedel numbers, Goedel could ask whether M(#M) is
> false, for any formula M(x), and that sounds like we want
>
> R(#M) = -M(#M)
>
> but there is no formula R(x) of LA which gives this.  However, he
> replaced this with the more subtle statement
>
>        R(#M) is true in N   iff    M(#M) is not provable in T
>
> which is the same as
>
>             |= R(#M)        iff    #M(#M) \notin #T
>
> and then produced R(x) by showing that there's a total recursive
> function d: N ---> N such that for any M(x),
>
> d(#M) = #M(#M)
>
> and that recursive means representable in T (or even better, Q), and
> that #T is recursively enumerable.   Let's go through this:
>
> B&J [Lem. 1, p 172] explain the function d quite well, why it's
> represented in Q by a formula D(x,y) of LA, that
>
> Q |- \exists_y ( D(#M, y) <--> y = #diagonalization(M) )
>
> where M(#M) is generalized to formulas M not of the form M(x) by
>
> diagonalization(M) = \exists_x ( x = #M & M )
>
>
> Now we need a formula C(y) of LA s.t. for any closed formula M,
>
>                    |= C(#M)    iff    M \notin T
>
> This follows from r.e. theory, but I haven't seen it written down
> anywhere.  N&N's argument here is cryptic to me.
>
> As B&J [pf. of Thm. 5, p 177] show, T being axiomatizable implies that
> #T is is recursively enumerable, meaning there's a a total recursive
> function f: N ---> N with Image(f) = #T.  Then as B&J prove [Ch 14],
> there's a formula A(x,y) s.t.
>
> f(n) = k  iff   Q |- \forall_y . A(n,y) <-> y = k
>
> Then we claim that B(y) = \exists_x A(x,y) defines Image(f) subset N
> in the sense that
>
>                  |= B(k)    iff    k \in  image(f)
>
> Proof:
>
> |= B(k)  =>  there exists n \in N s.t.  |= A(n,k)
>
> Since f is a total function, f(n) = l for some l \in N, but then k = l
> is provable in Q.  The other direction is easy.
> \qed
>
>
> Now let f and A(x,y) represent #T, and let C(y) = -B(y), so we have
>
>                  |= C(k)    iff    k \notin  #T
>
> so for any closed formula M, we have
>
>                  |= C(#M)    iff    M \notin T
>
> Now we can define R(x), since we've shown the 2 pieces of R are
> representable in Q.  Our desired relation is now
>
>                  |= R(#M)    iff    |= C( #M(#M) )
>
> and we achieve this by
>
> R(x) = \exists_y . D(x, y) & C(y)
>
> Then we get our Russell/Richard/Goedel contradictory sentence by
>
> G = diagonalization(R)
>
> and we have
>
> Q |-  G  <--> C( #G )
>
> as follows:
>
> Q |-   G <--> R(#R)
>
> Q |-   R(#M) <--> C( #diagonalization(M) ),   for any formula M of LA,
>
> Q |-  G  <--> C( #diagonalization(R) ) = C( #G )
>
> As before, I claim that we should think of this as the path to the
> Lambda Calculus Y combinator.  We note that the argument works for any
> C, and it looks like
>
> R(#M)  = C( #M(#M) )
>     G = R(#R)
>     G = C( #G )
>
> and in LC this makes perfect sense and gives a fixed point for C by
> stripping the pound signs:
>
> let  R = (lambda m. C (m m))
> let  G = (R R)
> then G = (C G)
>
> That's the only good motivation for the LC Y combinator I know of
>
> Y = (lambda c. (lambda r. (r r)) (lambda m. c (m m)))
>
> TLS makes a good try to motivate Y_v, but I don't quite buy it.
>
> Well, on to the contradiction.  Since T contains Q we have
>
> T |-  G  <--> C( #G )
>
> T |- G iff T |- C( #G ) implies |= C( #G ) iff G \notin T
>
> T |- G    implies    G \notin T
>
> Therefore, since T is assumed to be consistent, T cannot prove G,
> i.e. T |- G must be false.  On the other hand, G is true in N, since
>
> |= G  iff |= C( #G ) iff G \notin T
>
> and we just showed that G \notin T.  So  T is strictly smaller than
> Th(N).
>
>
> Now N&N claim to prove that
>
>                        T |- G    iff  T |- -G
>
> and I don't how they can prove that, since T |- C( #G ) doesn't tell
> us anything without passing to |= C( #G ).  I've based my treatment on
> what I could glean from their account.  To me,
>
> \forall z. - Dem(z,x)
>
> means x isn't provable in T and that's where my C comes from, but it
> has to take place in N, so I think they're gapping.
>
> Well, N&N isn't giving Goedel's argument anyway, because they say that
> Goedel actually proved
>
>            T |- G    implies    T |- -G
>
>            T |- -G   implies    LA is omega-inconsistent,
>
> Hmm...
>
> Bill
>
> For completeness, here is my earlier derivation.
>
>
> ********  Russell's paradox => Goedel I  ********
>
> We also have that for any expression M(x),
>
> |= F( #M )  iff  |= -B( #M(#M) )  iff |= -M(#M)
>
> The statement F( #M ) means F is satisfied by the Goedel number of M.
> Let's translate F( #M ) into set theory as M \in F.   Then we have
>
> M \in F   iff   M \notin M
>
> That is, F is the "set" F = { M : M \notin M }.  That's the Russell
>
>               F \in F   iff   F \notin F
> or
>                            G   iff   not G
>
> And we can even see where F & G came from.  Aping Russell, we'd like
> to define a formula F(x) such that for any M(x),
>
> F is satisfied by #M iff #M does not satisfy M.
>
> Well, we can do that if Th(N) is decided by B(y), we "define"
>
> F(#M) = -B( #M(#M) )
>
> Of course it's not clear that such an expression F(x) exists.  Let's
> recall B&J's argument.
>
> Recall \exists is the Tex symbol for "there exists", the backwards E.
>
> Then B&J define
>
> diagonalization: E ---> E
>
> by diagonalization(M) = \exists_x ( x = #M & M )
>
> So if N = N(x), i.e. x is the only free variable in N, then
>
> Q |-  diagonalization(N) <--> N(#N)
>
> Note the priveleged position of the variable x here.  Then B&J show
> [Lem. 1, p 172] that diagonalization is representable in Q, meaning
> that there's an expression A(x,y) in E such that for any expression M
> in E,
>
> Q |- \exists_y ( A(#M, y) <--> y = #diagonalization(M) )
>
> Then we define the expression F(x) by
>
> F = \exists_y ( A(x,y) & B(y) )
>
> so that for any M,
>
> Q |-   F(#M) <--> B( #diagonalization(M) )
>
> Then letting G = diagonalization(F), we have
>
> Q |-  G  <--> F(#F) <--> B( #diagonalization(F) ) = B( #G )
>
>
>
> I now think that Felleisen's argument in TLS [p 157-9] for the
> nonsolvability of Halting Problem isn't actually a proof.  I still
> like his argument, it's the 1st thing I ever understood about the
> connection between Goedel I & Russell's paradox.  I think it's quite
> suitable for TLS, and you don't claim that you've proven the theorem.
> However...
>
> He proves that there exists no Scheme function will-stop? that returns
> #t or #f on a function f depending on whether (f '()) halts or not.
> Then you give a nice Russell-ish argument,
>
> (define (last-try x)
>  (if (will-stop? last-try)
>      (eternity x)))
>
> (last-try '()) halts  <=> (last-try '()) doesn't halt
>
> But that doesn't prove the Halting Problem is unsolvable, because we
> can't perform any computation on Scheme expressions in Scheme.  I
> think that's true, anyway.
>
> Moving from LC_v to LC, Turing etc. proved that computable functions
> on N are lambda-definable, but I don't think computable functions on
> Lambda are definable in LC.  That's why they use the kludge of Goedel
> numbering
>
> #: Lambda ---> N
>
> to handle computations on Lambda, and then the #-fixed point theorem
>
> F [#X] = X
>
> Using this, Dana Scott showed [Hankin, ch 6] that
>
> {X in Lambda : X has a beta-nf}
>
> is undecidable, and that's essentially the nonsolvability of Halting
> Problem.  That is, there's no computable function h: N ---> {0,1} such
> that h^{-1}( 1 ) is the image
>
> {X has a beta-nf} subset Lambda -#--> N
>
> I think it would be interesting to port these nonsolvability theorems
> to LC_v, can you tell me where that was done?  And even better, is
> there some way of jazzing up LC_v to handle computations on Lambda?
> Maybe use Lambda as Data Constants?  I know that Scheme has functions
> that would allow you to do some computations on Lambda, like symbol? &
> procedure?.   But I still think no matter how rich Scheme is, that
> Scheme won't capture computations on Scheme expressions.
>
> *****************************************************
>
> I finally understand Odifreddi's cryptic derivation
>
>               Russell's paradox    =>    Y combinator
>
> Maybe Odifreddi's argument was badly translated from Italian, or maybe
> he didn't understand it himself.  I suspect Curry understood it,
> because Y is sometimes called "Curry's paradoxical combinator".
>
> I'm really happy with this argument, as it avoids the Goedel numbers
> that came up in the previous derivation via Goedel I.
>
> The proof of Russell's paradox (basically due to Cantor) can be
> generalized as follows.  See below for why this is a generalization.
>
> Let X and A be sets, and N: A ---> A be a function with no fixed
> point.  Let A^X be the set of functions from X to A.  Then Cantor
> showed that X is smaller A^X, and in fact his proof shows that for any
> function
>
> h: X ---> A^X
>
> we can construct an element R \in A^X that's not in the image of h.
> Thus, no function h is onto, and that's what smaller means.
>
> Our function h is "adjoint" to a function
>
> h': X \times X ---> A          h'(y,x) = h(y)(x)
>
> and then we can define the function R: X ---> A by
>
> R(x) = N( h'(x,x) )
>
> If we assume h is onto, we get the diagonal argument counterexample:
> assume R = h(y) for some y \in X, then
>
> R(x) = h'(y,x)  for all x \in X, but then
>
> R(y) = N( h'(y,y) ) = h'(y,y)
>
> and so h'(y,y) is a fixed point of N, and that's a contradiction.
>
>
> Well, the contrapositive of this argument gives a technique to produce
> fixed points!  I wouldn't have thought of it before reading it in
> Odifreddi, but it makes sense.  Suppose we think N: A ---> A really
> has a fixed point, and we have a function
>
> k: X \times X ---> A
>
> Let's define as above the function R: X ---> A by
>
> R(x) = N( k(x,x) )
>
> and let's suppose we can show that for some y \in X, we have
>
> R(x) = k(y,x)  for all x \in X.
>
> Then we know that N has a fixed point.
>
> And furthermore k(y,y) is a fixed point for N, since
>
> R(y) = N( k(y,y) ) = k(y,y)
>
>
> Now for the LC Y combinator.  Let A = X be the set Lambda/= of lambda
> expressions modulo alpha & beta equivalence, and application gives us
> a function
>
> k: X \times X ---> X
>
>       (U,V) |--> U V
>
> Given N \in X, we have a function
>
> N: X ---> X
>
>  U |--> N U
>
> Now we look at the function R: X ---> X by
>
> R(x) = N (x x)
>
> and we can see that for
>
> y = \x. N (x x)
>
> we have
>
> R(x) = k(y,x)  for all x \in X
>
> So the Russell/Cantor argument above tells us that
>
> y y = k(y,y) is a fixed point for N,
>
> and that's the LC Y combinator,
>
> \n. (\y. y y) (\x. n (x x))
>
> Isn't that nice!
> \qed
>
> There's no need to modify this derivation of Y to give Y_v of LC_v,
> because Y_v is the port of Y to LC_v, once we understand where the
> formula for Y came from, we don't have to wonder about where Y_v came
> from, in case we aren't satisfied by the derivation of Y_v in TLS :)
>
> BTW I note that your partial-derivation of Y_v in TLS comes
> immediately after your partial-proof of the nonsolvability of the
> Halting Problem, so I could easily believe you knew this argument, but
> presented something more suitable for the young in TLS.
>
>
> Russell's paradox is really Cantor's theorem that any set X is smaller
> than its power set 2^X.  In fact, given any function
>
> h: X ---> 2^X
>
> we can construct an element R \in 2^X that's not in the image of h:
>
> R = { x \in X : x \notin h(x) }
>
> Recall the contradiction: assume R = h(y) for some y \in X, then
>
>         y \in R    iff    y \notin h(y)    iff   y \notin R
>
> But instead let's think of 2^X as the set of functions from X to
> {T,F}, and we have the function
>
> -: {T,F} ---> {T,F}
>
> that switches T and F, then R \in 2^X is the function
>
> x |--> - h(x)(x)
>
>   We see this translation of R because the correspondence between the
>   power set and function is
>
>   f: X ---> {T,F}      |-->      f^{-1}(T) subset X
>
>   So
>
>   - h(x)(x) = T    iff    x \notin h(x)^{-1}(T) subset  X
>
> Now we have the usual diagonalization argument:
>
> R = h(y) means
>
> - h(x)(x) = R(x) = h(y)(x)              for all x \in X
>
> but that gives the contradiction
>
> - h(y)(y) = h(y)(y)    \in       {T,F}
>
>
> To really get Russell's paradox, if X is the nonsensical set of all
> sets, then 2^X is a subset of X.  After all, any subset of X is a set,
> hence an element of X.  So instead of a function
>
> h: X ---> 2^X
>
> we might as well take the composite
>
> X -h--> 2^X subset X
>
> and we have such a function, the identity function.  Then
>
> x \notin h(x)
>
> translates to
>
> x \notin x
>
> and we have the usual Russell set
>
> R = { x \in X : x \notin x }
>
> and y = R, so the paradox is
>
>         R \in R    iff    R \notin R
>
> And this last paradoxical sentence has a flavor that's shared by your
> TLS argument
>
> (define (last-try x)
>  (if (will-stop? last-try)
>      (eternity x)))
>
> (last-try '()) halts  <=> (last-try '()) doesn't halt
>
> even if last-try isn't really analogous to R.  But it's a start, and I
> was really thrilled to see your argument, it was the 1st thing I'd
> ever understood about the proof of Goedel I.  But now in Sandy
> Zabell's class I went through the proof of the nonsolvability of the
> Halting Problem as well as Goedel I and I was able to squeeze a much
> stronger connection between Russell's paradox and Goedel I/Halting
> from Boolos & Jeffries, which I already posted, but I only worried
> about it to deduce Y, which we now see comes directly from Russell's
> paradox.  Hard to say my Goedel efforts were wasted, though, it's a
> nice theorem :)  But I'd sure like to get the Goedel numbers out of the
> Scott's undecidability of
>
> {X in Lambda : X has a beta-nf}
>
>
>

`

This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.