Dear ALPS users,
I have a large set of tasks defined in a single job file, parm.in.xml.
The job was terminated after some of the tasks had completed, and my
parm.out.xml now looks like this:
===========<SNIP-1>=================
......
......
<TASK status="finished">
<INPUT file="parm.task62.out.xml"/>
</TASK>
<TASK status="finished">
<INPUT file="parm.task63.out.xml"/>
</TASK>
<TASK status="new">
<INPUT file="parm.task64.out.xml"/>
</TASK>
<TASK status="new">
......
......
============<SNIP-1>================
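In case it helps to verify the state of the job file, here is a minimal Python sketch of how I would list the task statuses and find the first unfinished task. It relies only on the TASK and INPUT elements shown above; the <JOB> wrapper in the sample string and the function names are my own assumptions for illustration, not taken from the ALPS tools.

```python
import xml.etree.ElementTree as ET

# Sample data mirroring the snippet above. The <JOB> root element is an
# assumption made so that the sample is well-formed XML.
SAMPLE = """<JOB>
  <TASK status="finished">
    <INPUT file="parm.task62.out.xml"/>
  </TASK>
  <TASK status="finished">
    <INPUT file="parm.task63.out.xml"/>
  </TASK>
  <TASK status="new">
    <INPUT file="parm.task64.out.xml"/>
  </TASK>
</JOB>"""


def task_statuses(root):
    """Return (input file, status) pairs for every TASK element, in order."""
    pairs = []
    for task in root.iter("TASK"):
        inp = task.find("INPUT")
        name = inp.get("file") if inp is not None else None
        pairs.append((name, task.get("status")))
    return pairs


def first_unfinished(root):
    """Return the input file of the first task not marked finished, or None."""
    for name, status in task_statuses(root):
        if status != "finished":
            return name
    return None


# For a real job file, use: root = ET.parse("parm.out.xml").getroot()
root = ET.fromstring(SAMPLE)
for name, status in task_statuses(root):
    print(name, status)
print("first unfinished:", first_unfinished(root))
```

On the sample above this reports parm.task64.out.xml as the first unfinished task, which matches where I expect the restart to resume.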
To restart the job from task 64, I run the following command, without
changing anything in the output files:
worm --Tmin 10 parm.out.xml
Could you please confirm that this is correct? I ask because, a short
while after I run the command, the simulation stops making progress:
============<SNIP-2>===============
.........
.........
Loading information about run 1 from file parm.task62.out.run1
Loading information about run 1 from file parm.task63.out.run1
Task 1 finished.
........
........
Task 63 finished.
Created run 1 locally
Starting task 64.
Checking if it is finished: not yet, next check in 10 seconds ( 0% done).
Checking if it is finished: not yet, next check in 203 seconds ( 1% done).
Checking if it is finished: not yet, next check in 174 seconds ( 23% done).
=============<SNIP-2>==============
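To check whether the reported percentage still advances, the progress lines can be pulled out of a saved copy of the output. This is only a rough sketch: the regular expression is my own guess based on the three lines shown above, and it tolerates the line wrapping introduced by my mail client.

```python
import re

# Output copied from the snippet above, including the wrapped lines.
LOG = """Starting task 64.
Checking if it is finished: not yet, next check in 10 seconds
( 0% done).
Checking if it is finished: not yet, next check in 203
seconds ( 1% done).
Checking if it is finished: not yet, next check in 174
seconds ( 23% done).
"""

# Pattern inferred from the lines above; \s+ also matches line breaks.
PROGRESS = re.compile(
    r"next\s+check\s+in\s+(\d+)\s+seconds\s+\(\s*(\d+)%\s+done\)"
)


def progress_reports(log_text):
    """Return (seconds until next check, percent done) for each report."""
    return [(int(sec), int(pct)) for sec, pct in PROGRESS.findall(log_text)]


reports = progress_reports(LOG)
for seconds, percent in reports:
    print(f"next check in {seconds:>4} s, {percent:>3}% done")
```

In my case the last report is at 23% done, and no further progress lines appear after it.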
From there on the simulation hangs for many hours (I have tried this
procedure repeatedly). It should not take this long, because my SWEEPS
and THERMALIZATION values are not excessively large.
Any ideas on why the restarted simulation does not seem to proceed
would be appreciated.
With regards,
Vipin