Re: [isabelle] Any fans of "try"?



I use "try" routinely and find it essential.  Here are a few notes:

(1)  I find it much more convenient to type in "try" directly into the input while trying
	to sketch out a proof development than to use the separate Sledgehammer interface.
	(To me, the latter seems to be more useful for searching for perhaps better or faster
	proofs without disturbing the existing text.)

        I use "try" during proof development to try to see if a proof step can be quickly
	verified as true, without necessarily having to tediously decompose it and write
	out all the steps.  I am willing to wait a little bit for this, though it tends to
	come with some risk of overloading the prover (more on this below).  Many times,
	a proof by "smt" is produced, which I keep temporarily and then later improve,
	either again using "try" or sledgehammer to find a better proof (which is often
	possible later due to additional facts that I added while fleshing out the theory),
	or by explicitly expanding the proof to eliminate the use of "smt".

(2)  "try" is extremely helpful for becoming familiar with the behavior of the various
	proof methods, and with theories one has not used before.  Of course, one must
	not expect it to do magic, even though sometimes it seems like it does.

(3)  "try" will find proofs that "try0" misses, due to apparent time limits in "try0".
	For example, "try0" will typically not find find proofs by "force" or "fastforce"
	that take ten seconds or more, even if all the necessary facts have been given.
	The limits seem to be removed in "try", which allows the long proofs to be found.

(4)  There appear to be some strange anomalies in "try" regarding the scheduling of the
	various proof methods it tries.  One would expect that "try" would run the various
	methods concurrently as much as possible.  However, in some situations, "try" can
	run indefinitely without finding a proof, whereas "try0" will immediately report
	a proof by "force" or "fastforce" that takes only a few tens of milliseconds.

(5)  "Heat" while using "try" seems to be related less to CPU usage than it does to memory
	usage.  Using "try" can quickly blow up the prover heap and put the computer into
	thrashing conditions, unless suitable limits have been imposed when invoking Isabelle.
	The desktop that I generally use for proof development has 24GB of RAM and runs Ubuntu.
	I have typically limited the ML heap to 8GB and the Java heap to 4GB and this generally
	allows me to do other things like browse the web and read email along with running
	Isabelle, without entering a thrashing situation.  Lately I have found it necessary
	to increase the ML heap limit to 12GB to avoid excessive garbage collection,
	and this will produce thrashing sometimes, especially in conjunction with the ridiculous
	memory consumption of Firefox and Thunderbird.

	Note that "try" can be less than useless on a memory-poor system.  Although I was
	successfully using a 6GB laptop for some time, I always had to be careful about using "try"
	with it, which put a significant hitch into my normal development routine.
	About two years ago I found that I was no longer able to do anything useful on
	that platform.  At the moment, 16GB is still a useful configuration, but barely.

(6)  I believe there to be issues in the infrastructure regarding the cancellation of a
	"try" that has already been started.  With a fair amount of frequency, if you delete
	a running "try" from the document, the search will not be cancelled immediately,
	but rather Poly will run for arbitrarily long amounts of time at high CPU,
	apparently until the originally scheduled tasks have completed.  In previous
	versions of Isabelle, I have observed an apparent blow-up in the scheduling of
	future tasks that can produce a backlog that can take many minutes to clear, though
	this seems to have improved in recent Isabelle versions.

	It is difficult and frustrating to try to understand what is going on during periods
	of overload, because the "Isabelle monitor" does not get any data from Poly during
	precisely the periods of time when you want to know what is happening.
	So all you see is happy, normal behavior, computing is going on, garbage collection
	is taking place regularly and not too frequently, and the number of queued tasks is
	stable, and then when an overload sets in the display stops for the duration.
	When the overload clears, you just get a straight line drawn from the conditions before
	to the conditions after, so you learn nothing.

(7)  When the prover seems to have been spammed as a result of "try", I often have to make a
	guess about whether it will be faster to wait for the overload to clear, or to just
	restart Isabelle and re-check the context from the beginning.

My conclusion:  I find "try" to be extremely useful, and would be very unhappy to see it go away.
However, I suspect that fixes to some of the issues I have described above would make it less
touchy to use.

									- Gene Stark


On 5/26/20 6:29 AM, Jasmin Blanchette via Cl-isabelle-users wrote:
> Dear all,
> 
> The "try0" and "try" commands were introduced to give an easy way to try out various tactics and tools. "try0" only applies a selection of tactics, whereas "try" additionally runs "solve_direct", Quickcheck, and reduced versions of Nitpick and Sledgehammer.
> 
> I know people use "try0", including some experts, but I'm trying to find out whether "try" is used. The feedback I have about it, as well as my own impression, is that it just uses too much CPU at the same time to be useful, making the machine overheat. Also, because Nitpick and Sledgehammer are not the real thing, after a failed "try" one might still want to run them (esp. Sledgehammer), so "try" is not helping much. Finally, "solve_direct" and Auto Quickcheck are run by default, and you can also enable Auto Nitpick (which I've done for years), so in the end, "try" is really just "try0 + reduced Sledgehammer + lots of heat".
> 
> My proposal:
> 
> 1. Remove the "try" command.
> 
> 2. Rename "try0" to "try".
> 
> 3. (Maybe?) Enable Auto Nitpick.
> 
> (Auto Nitpick is a reduced version of Nitpick that doesn't take that much CPU and won't run for more than 5 s. It's useful on nonexecutable conjectures like "hd [] = 0" and other problems with an axiomatic flavor.)
> 
> Any opinions?
> 
> Jasmin
> 
> 





This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.