Re: [isabelle] Greyout followed by gc error



The garbage collector has got increasingly complex particularly now that it has been parallellised. It relies on various properties of the underlying memory to avoid too much locking. Bugs in the GC can result in the memory getting messed up leading to random errors and segfaults at some indeterminate point in the future. To try and capture errors earlier rather than later there are many assertion checks that check for consistency.

This particular assertion fault seems to happen under heavy load. I've been aware of it but I've never been able to reproduce it myself. Without a core dump with the debugging symbols included it's virtually impossible to see what is happening. The fault itself indicates that a value is not what it should be but not how it got there. I was hoping that perhaps there would be an example that would demonstrate the fault sufficiently consistently that I would be able to reproduce it.

The fault, though, is in a piece of code that was added to deal with large vectors/arrays which were previously very inefficient. It would be possible to take that out. I think the best solution is to try and make some effort to try and fix the fault and if that fails to remove the code.

David

On 14/01/2015 16:02, Peter Lammich wrote:
For me it looks like there is some bug in the PolyML garbage collector,
which is likely to be triggered under heavy parallel load. David??

I get this assertion sometimes (3 times in the last month) if I open a
theory that depends on a whole bunch of new theories.

My machine is

Intel(R) Core(TM) i7-2720QM CPU @ 2.20GHz with 4 cores (i.e. 8 cores
with hyperthreading), with 8GB of RAM


--
   Peter


On Mi, 2015-01-14 at 16:40 +0100, Joachim Breitner wrote:
Hi,

I just had a greyout followed, after a while, by this message:

Welcome to Isabelle/HOLCF-Nominal2 (Isabelle2014: August 2014)
poly: gc_mark_phase.cpp:439: virtual void MTGCProcessMarkPointers::ScanAddressesInObject(PolyObject*, POLYUNSIGNED): Assertion `baseAddr > (PolyWord*)obj && baseAddr < ((PolyWord*)obj)+length' failed.

Malformed message:
bad chunk (unexpected EOF after 136 of 618 bytes)
message_output terminated
/opt/isabelle/Isabelle2014/lib/scripts/run-polyml-5.5.2: Zeile 84:  7181 Abgebrochen             "$POLY" -q -i $ML_OPTIONS --eval "$(perl "$ISABELLE_HOME/lib/scripts/recode.pl" "$MLTEXT")" --error-exit < /dev/null

standard_error terminated
standard_output terminated
process terminated
command_input terminated
process_manager terminated
Return code: 134


This happened during a refactoring phase, where I have a number of
„target“ theories ticked in the theory panel, which obviously puts some
strain on the system. I was disabling and enabling the global
„Continuous checking“ parameter before, which might be related.

This is a Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz machine with 4 cores,
8GB or RAM.

Greetings,
Joachim







This archive was generated by a fusion of Pipermail (Mailman edition) and MHonArc.