[isabelle] Isabelle2013-2-RC3 ATP e_par acid test



Makarius,

The short story is that I can reliably get non-terminating processes (or processes that can't be shut down, and that will run for a long time), but so far, I can only do that with the ATP e_par, and when it's started with a bunch of other ATPs.

The theorem I use for the test has to be put it at the bottom of a theory, so that Sledgehammer can find some relevant facts.

The test is to start Sledgehammer, and then terminate it prematurely by deleting the "r" in "Sledgehammer", sometimes soon, and sometimes later. I actually have to let e_par run for a while to get the right conditions for it to not terminate.

I didn't spend a lot of time experimenting with the command I show next, which doesn't use e_par, but it uses most every ATP that's available to me, and I haven't seen any problems from prematurely terminating the Sledgehammer command. This is the command:

   ML {* Multithreading.max_threads_value () *}
   ML {* Multithreading.max_threads := 300 *}
   ML {* Multithreading.max_threads_value () *}

   theorem "False = True"
   sledgehammer [minimize=smart,preplay_timeout=10,timeout=60,verbose=true,
                 isar_try0=true,isar_proofs=smart,provers="
      remote_vampire  metis  remote_satallax  z3_tptp  remote_e  smt
      remote_e_tofof  spass  remote_e_sine    e        z3        yices

remote_snark remote_iprover remote_z3 remote_waldmeister spass_poly remote_cvc3 remote_agsyhol remote_leo2 remote_iprover_eq leo2 cvc3
   "]
   oops

From here, two key points are that e_par starts up 12 eprover.exe processes regardless of what max_threads is set to, that the number and order of ATPs run with it affect things, and "max_thread := 4" is what sets up the particular sequence I list below.

I give the Isar command below, but I work it like this:

1) I start the command.
2) Lots of eprover processes will be started and terminated, several times.
3) At about 40 seconds using my CPU, the ATP "leo" will be running with 12 eprover.exe processes, during a time when processes aren't being shut down and started (shown by green and red lines using http://systemexplorer.net/ )
4) I delete the "r" in "Sledgehammer".
5) The processes keep running much long than what I'm willing to wait. (I have waited up to 15 minutes to see if they would terminate, and they didn't.) 6) I exit the PIDE. Processes are terminated except for the ATP related processes. I attach a screen shot showing the many non-terminating processes.

The command is this:

   ML {* Multithreading.max_threads_value () *}
   ML {* Multithreading.max_threads := 4 *}
   ML {* Multithreading.max_threads_value () *}

   theorem "False = True"

   sledgehammer [minimize=smart,preplay_timeout=10,timeout=60,verbose=true,
                 isar_try0=true,isar_proofs=smart,provers="
      e_par e leo2 z3 metis z3_tptp spass yices cvc3 smt spass_poly
   "]

   oops

The steps listed above don't always work, but they probably work more than 50% of the time.

In the PIDE options, the default for "Threads" is "0", which is what controlled my "max_thread" until I started changing "max_threads" with the ML command, which affected me being able to get the sequence of events to happen as I described above. The value "max_threads := 4" gets the same results as the "Threads" set to "0". I have a 4 core CPU with no hyperthreading.

This is no problem for me. The ATP e_par actually terminates much of the time, and I haven't been using it because of the large number of processes that it starts.

Regards,
GB













Attachment: 131204_non-term-processes.png
Description: PNG image



This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.