Re: [isabelle] Non-idempotence of datatype constructors

I attached a proof of concept (works with Isabelle 2020) using the
simple size-based approach, including some example applications.

It works well, although I'm not sure what the proper way to get the
datatype information is (e.g. the list of all the constructors of the
datatype and the associated other datatypes in case of mutual recursion).

Is the ML interface of the BNF package documented anywhere (in
particular this aspect)?


On 02/05/2020 16:19, Manuel Eberl wrote:
> True, but after your suggestion, I realised that the solution with the
> "proper subexpression" relation (or, alternatively, the size function)
> combined with a simproc that produces these theorems on the spot is
> actually the superior approach in every respect. It's simpler, more
> general, and probably more performant.
> I can try to come up with a proof-of-concept implementation using the
> size function approach, since that needs no additional new features from
> the BNF package.
> Manuel
> On 02/05/2020 16:16, Tobias Nipkow wrote:
>> A first version which only deals with depth 1 would already cover most
>> of the practical cases.
>> Tobias
>> On 02/05/2020 15:50, Manuel Eberl wrote:
>>> That sounds like a good idea.
>>> However, if such a simproc is to work for any nesting of
>>> constructors,having pre-proven theorems for every constructor will not
>>> be enough.Instead, I suppose one would have to introduce a
>>> "proper-subexpression"relation for datatypes (e.g. xs < Cons x xs) along
>>> with a proof thatthis relation has the obvious properties (irreflexive,
>>> asymmetric,transitive).
>>> I guess that is something that only a datatype package plugin similar
>>> tothe one for the "size" function could provide. Having looked at the
>>> codebriefly, I think only the people who wrote the BNF package could (or
>>> atleast should) implement that.
>>> Alternatively, one could just use the size function (as someone
>>> alreadysuggested in this thread) to get pretty much the same thing,
>>> except thatit won't work for all datatypes (e.g. infinitely branching
>>> ones).
>>> Manuel
>>> On 02/05/2020 15:36, Tobias Nipkow wrote:
>>>> I do think such rules are useful, esp if they are there by default. I
>>>> suggest they are best handled by a simproc that is triggered by any
>>>> "(=)" but that checks immediately if the two argumenst are of the
>>>> appropriate type and form. That can be done very quickly (there are
>>>> similar simprocs already). The simproc should work for any datatype and
>>>> any degree of nesting of the constructors.
>>>> Tobias
>>>> On 01/05/2020 22:51, Manuel Eberl wrote:
>>>>>> Firstly, I don't think these theorem is especially useful. You might
>>>>>> have planned to add this to the simplifier, but its term net
>>>>>> doesn't do
>>>>>> any magic here. It will end up checking every term that matches
>>>>>> "Cons x
>>>>>> xs = ys" for whether "xs" can match "ys". I'm not sure if that
>>>>>> matching
>>>>>> is equality, alpha-equivalent or unifiable.
>>>>> I honestly never think /that/ much about the performance
>>>>> implications of
>>>>> simp rules (as long as they're unconditional). At least for lists, by
>>>>> the way, this is already a simp rule by default though, and lists are
>>>>> probably by far the most prevalent data type in the Isabelle universe.
>>>>> But you're certainly right that it would make sense to keep a look at
>>>>> this performance impact if one wanted to add these to the simp set for
>>>>> all datatypes by default, and I agree that the rules are probably not
>>>>> helpful /that/ often. Still, it might be nice to have them available
>>>>> nonetheless.
>>>>>> Secondly, you can prove these theorems by using this handy trivial
>>>>>> theorem : "size x ~= size y ==> x ~= y". Apparently that theorem
>>>>>> has the
>>>>>> name  Sledgehammer.size_ne_size_imp_ne - presumably the sledgehammer
>>>>>> uses it to prove such inequalities.
>>>>> It's even easier to prove it by induction. Plus, in fact, the "size"
>>>>> thing only works if the data type even has a sensible size function.
>>>>> That is not always the case, e.g. when you nest the datatype through a
>>>>> function.
>>>>> My main point however is that when you have a datatype with a dozen
>>>>> binary constructors, there's quite a bit of copying & pasting involved
>>>>> before you've proven all those theorems. Since it can (probably) be
>>>>> automated very easily, why not do that? Whether or not these should be
>>>>> simp lemmas by default is another question.
>>>>> Manuel
theory Foo
imports Main

lemma size_neq_size_imp_neq: "size x \<noteq> size y \<Longrightarrow> x \<noteq> y"
  by (erule contrapos_nn) (rule arg_cong)

ML \<open>

exception DATATYPE of string

val size_neq_imp_neq_thm = @{thm HOL.Eq_FalseI[OF Foo.size_neq_size_imp_neq]}

fun mk_size T = Const (@{const_name "Nat.size"}, T --> @{typ "Nat.nat"})

fun get_bnf_sugar ctxt s =
  case BNF_FP_Def_Sugar.fp_sugar_of ctxt s of
    SOME sugar => sugar
  | NONE => raise DATATYPE s

fun get_ctors (sugar : BNF_FP_Def_Sugar.fp_sugar) =
  |> #fp_ctr_sugar
  |> #ctr_defs
  |> map (Thm.concl_of #> Logic.dest_equals #> fst #> dest_Const #> fst)

fun get_mutuals (sugar : BNF_FP_Def_Sugar.fp_sugar) =
  |> #fp_res
  |> #Ts
  |> map (dest_Type #> fst)

fun get_ctors_mutuals ctxt sugar =
  sugar |> get_mutuals |> map (get_bnf_sugar ctxt #> get_ctors) |> flat

fun get_size_info ctxt s =
  case BNF_LFP_Size.size_of ctxt s of
    SOME info => info
  | NONE => raise DATATYPE s

fun is_comb (_ $ _) = true
  | is_comb _ = false

(* TODO: possible improvements:
   - better/more canonical way to get list of constructors of the cluster of mutually
     recursive datatypes?
   - presimplify generated premise using the equations of the size function
   - replace size-based proof with proper subexpression relation
fun datatype_no_proper_subterm_simproc ctxt ct =
    val (clhs, crhs) = ct |> Thm.dest_comb |> apfst (Thm.dest_comb #> snd)
    val cT = Thm.ctyp_of_cterm clhs    
    val T = Thm.typ_of cT
    val T_name = T |> dest_Type |> fst

    val sugar = get_bnf_sugar ctxt T_name
    val _ = get_size_info ctxt T_name
    val ctors = get_ctors_mutuals ctxt sugar
    val is_ctor = member op= ctors
    val (lhs, rhs) = apply2 Thm.term_of (clhs, crhs)

    fun is_subterm s t =
        fun go t =
          s aconv t orelse (is_comb t andalso go' (strip_comb t))
        and go' (Const (c, _), ts) = is_ctor c andalso exists go ts
          | go' _ = false
        go t

    fun mk_thm () =
      SOME (Drule.instantiate'_normalize [SOME cT] [SOME clhs, SOME crhs] size_neq_imp_neq_thm)
    if is_subterm lhs rhs orelse is_subterm rhs lhs then mk_thm () else NONE
    handle DATATYPE _ => NONE
      | CTERM _ => NONE
      | TERM _ => NONE
      | TYPE _ => NONE


simproc_setup datatype_no_proper_subterm ("x = y") = \<open>K datatype_no_proper_subterm_simproc\<close>

(* Tests *)

datatype 'a mytree = A 'a | B "'a mytree" "'a mytree"

ML_val \<open>
  datatype_no_proper_subterm_simproc @{context}
    @{cterm "B l r = l"}

lemma "B l r \<noteq> l" and "B l r \<noteq> r" and "l \<noteq> B l r" and "r \<noteq> B l r"
  by simp_all

  fix a b c d e f :: "nat mytree"
  define t where [simp]: "t = B (B (B a b) (B c d)) (B e f)"
  have "\<forall>x\<in>{a,b,c,d,e,f}. t \<noteq> x"
    by simp
  have "\<forall>x\<in>{a,b,c,d,e,f}. x \<noteq> t"
    by simp

(* Tests for mutually recursive datatypes *)

datatype 'a myty1 = A1 'a | B1 "'a myty1" "'a myty2"
     and 'a myty2 = A2 'a | B2 "'a myty2" "'a myty1"

ML_val \<open>
  datatype_no_proper_subterm_simproc @{context}
    @{cterm "c = B2 a (B1 b c)"}

lemma "B1 a (B2 b c) \<noteq> a" and "B1 a (B2 b c) \<noteq> c"
  by simp_all

lemma "B2 a (B1 b c) \<noteq> a" and "B2 a (B1 b c) \<noteq> c"
  by simp_all


This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.