[isabelle] Code generation & type classes - inefficient code


Generation of ML code from Isabelle mimics type classes by passing around explicit records containing the functions specific to the type class.

E.g., you might get a function "inner_product A_ a b = map (#times A_) (zip a b)" (roughly).

This works well, however, I noticed that the resulting code can be very inefficient because the same records are constructed over and over. For example, the term "[enumhd,enumhd,enumhd,enumhd,enumhd,enumhd] :: (bool*bool*bool*bool*bool) list" (where enumhd is some constant we defined using typeclass enum, the details are immaterial here) generates the code

val code = [ enumhd
     (Enum.enum_prod Enum.enum_bool
       (Enum.enum_prod Enum.enum_bool
         (Enum.enum_prod Enum.enum_bool
           (Enum.enum_prod Enum.enum_bool Enum.enum_bool)))),

  ... repeated six times ... ]

(Demonstrated in the attached Scratch.thy)

The effect of this is two-fold:

 * The generated code becomes much longer and thus compiles potentially
   very slowly (I had a term in my application that was ~2500 lines of
   code, and became ~150 lines after removing the repeated type class
   generation code. The 2500 lines took ~40-60 seconds to compile which
   means that a simple "by eval" takes that long.)
 * A runtime overhead is incurred. E.g., the Enum.enum_prod function
   above is internally producing products of lists over an over again.
   This could be prohibitive if the Enum.enum_prod (or similar) occurs
   in a function that is used in a loop.

I would have expected code like this:

   val code = let val enum1 = (Enum.enum_prod Enum.enum_bool
           (Enum.enum_prod Enum.enum_bool
             (Enum.enum_prod Enum.enum_bool
               (Enum.enum_prod Enum.enum_bool Enum.enum_bool))))

                            val enumhd1 = enumhd enum1

   in [enumhd1,enumhd1,enumhd1,enumhd1,enumhd1,enumhd1] end

This would compute every class only once (and if enumhd creates typeclasses, this would also be done only once during the computation of enumhd1).

(The general case would be: "f::args=>result" becomes "fun f <typeclasses> = let <compute type classes and instantiate constants> in fn args => <code> end". This way, invoking "val f1 = f <typeclasses>" already computes all recursively type classes, and then "f1 args" can be used several times without recomputing the type classes.)

My questions:

 * Is there a way to tell the code generator to do things more
   efficiently? (E.g., the way I described.)
 * Perhaps future versions of the code generator could do this? (I
   don't know if the code generator is still being developed.)

Best wishes,

theory Scratch
  imports Main

(* A contrieved example *)
definition "enumhd = hd Enum.enum"
definition "code = ([enumhd,enumhd,enumhd,enumhd,enumhd,enumhd] :: (bool*bool*bool*bool*bool) list)"

(* To inspect the code *)
export_code code in SML file_prefix "enumhd"
(* Result: code that contains many copies of

     (Enum.enum_prod Enum.enum_bool
       (Enum.enum_prod Enum.enum_bool
         (Enum.enum_prod Enum.enum_bool
           (Enum.enum_prod Enum.enum_bool Enum.enum_bool))))

   The type classes are contructed over and over at runtime and slow down compiling.

(* Running the code *)
value code

(* This should give us potential speedups *)
code_reflect enumhd functions enumhd
(* But it does not work because code_reflect produces ML function with different type class arguments *)
value code

This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.