Re: [isabelle] Questions about Quickcheck



Dear Manuel,

since I am subscribed the isabelle users' mailing list for a couple of
days now, I can tell you some of the rationales for the design
decisions in quickcheck.

Three years ago, I did some profiling of quickcheck's execution and
observed that much time was actually spent constructing and
deconstructing the term representations for printing no matter if
counterexamples were found or not. For fair comparison between
different testing approaches that all are based on the same code
generation setup and are quasi-complete, in the sense, if there is a
counterexample, it will eventually be found (assuming the
counterexample can be represented as ground term with the given code
generation setup), the testing performance is essential. The overall
testing performance would render one approach superior or inferior to
another one.

Exhaustive generators are more performant than full-exhaustive, but do
not allow to print terms when functions are involved.
Full_exhaustive generator are less performant but can always print the
terms, and hence, are generally more useful.

I was always hoping that a lifting and transfer mechanism would allow
to transfer the function space of whole theory developments into the
Andreas' FinFuns function space and make this automatic transferred
code setup usable for quickcheck. This would make the fast exhaustive
generators as powerful as the full_exhaustive ones, but (I believe)
this development has not happened yet.

In practice (thanks to much work on the IDE, non-blocking interaction
and use of multiple cores), quickcheck is powerful and useful even if
it is not the slightly optimized version yet.

Commonly, the manual and automatic setup for random and exhaustive
testing is done in one go. So both strategies would usually work and
start searching for counterexamples or both would fail, as they rely
on the same code generation setup. So then to find counterexamples
quickly, applying one strategy after the other seems awkward, and I
preferred to implement that multiple strategies can be applied in
parallel. As back in 2012, we still had a blocking IDE in wider use,
the blocking auto-quickcheck was limited to only one strategy; I
decided for exhaustive testing.

Without further investigation of the past, I cannot tell you the
reason for the definition of the current Multiset quickcheck setup.
Possibly, during some work on the new datatype package or on code
generation of multisets, some one touched the quickcheck setup, but
was not aware of the quickcheck_generator command---but that's just a
wild guess from my side; one would really need to look at the relevant
changesets and their context. The multiset theory has quite some
evolution since I looked deeply into that file.

I hope that helps.

Best regards,

Lukas




This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.