Please use the new directed worm algorithm code instead. Tama Ma can help you get started. We will remove the old worm code soon.
Matthias
On 02 Apr 2014, at 14:58, vvarma@ictp.it wrote:
Dear ALPS users,
More information on the problem I reported earlier. In summary: a job-file with many tasks, when re-started, hangs at a particular task. Details:
- If I pick out the parameter file of the hung task from the job-file and run it separately, the simulation of this particular task goes through. So there's no fundamental problem with the task(s) being run.
- If I drastically reduce the SWEEPS and THERMALIZATION of just the hung task in the job-file and re-start the job-file, the hung task and all succeeding tasks go through. This is surprising, because all the tasks of a job-file are more or less the same (small temperature differences, random seeds, same SWEEPS and THERMALIZATION, etc.).
- Parameters of an example hung task:
================<SNIP>======================
LATTICE="inhomogeneous simple cubic lattice periodic"; L=14;
MODEL="hardcore boson"; V=0; t=1.0; NONLOCAL=0;
T=1.32; SWEEPS=50000; THERMALIZATION=100000;
{DISORDERSEED=69830; mu=8*2*(random()-0.5);}
==================<SNIP>=====================
I would appreciate any ideas on why, when I re-start a job-file, the first task to be run always hangs (blocking everything else), while the same task goes through when run separately.
Thanks, Vipin
On 03/31/2014 12:07 PM, vvarma@ictp.it wrote:
Dear ALPS users,
I have a large list of tasks run from a given job file parm.in.xml, which got terminated after some tasks were completed; my parm.out.xml looks like this after the termination:
===========<SNIP-1>=================
......
......
<TASK status="finished">
  <INPUT file="parm.task62.out.xml"/>
</TASK>
<TASK status="finished">
  <INPUT file="parm.task63.out.xml"/>
</TASK>
<TASK status="new">
  <INPUT file="parm.task64.out.xml"/>
</TASK>
<TASK status="new">
......
......
============<SNIP-1>================
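For reference, the task statuses in a job file like the one above could be listed with a short script. This is only a minimal sketch using Python's standard library, not part of ALPS: the function names `task_statuses` and `first_unfinished` are made up for illustration, and the `<JOB>` wrapper element in the sample is an assumption, since the excerpt above only shows the `<TASK>` entries.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the excerpt above.
# The <JOB> root element is an assumption; only the TASK entries come from the excerpt.
SAMPLE = """<JOB>
  <TASK status="finished"><INPUT file="parm.task62.out.xml"/></TASK>
  <TASK status="finished"><INPUT file="parm.task63.out.xml"/></TASK>
  <TASK status="new"><INPUT file="parm.task64.out.xml"/></TASK>
</JOB>"""


def task_statuses(xml_text):
    """Return (input_file, status) pairs for every TASK element, in document order."""
    root = ET.fromstring(xml_text)
    pairs = []
    for task in root.iter("TASK"):
        inp = task.find("INPUT")
        name = inp.get("file") if inp is not None else None
        pairs.append((name, task.get("status")))
    return pairs


def first_unfinished(xml_text):
    """Return the input file of the first task whose status is not 'finished'."""
    for name, status in task_statuses(xml_text):
        if status != "finished":
            return name
    return None


if __name__ == "__main__":
    for name, status in task_statuses(SAMPLE):
        print(name, status)
    print("first unfinished:", first_unfinished(SAMPLE))
```

To apply this to a real file, read its contents and pass the text to `task_statuses`; the first unfinished entry is the task at which a re-start would be expected to resume.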
To restart the job from task64, I execute the following command (without changing anything in the output files):
worm --Tmin 10 parm.out.xml
Could you please confirm that this is correct? I ask because the simulation stops making progress some time after I execute the above:
============<SNIP-2>===============
.........
.........
Loading information about run 1 from file parm.task62.out.run1
Loading information about run 1 from file parm.task63.out.run1
Task 1 finished.
........
........
Task 63 finished.
Created run 1 locally
Starting task 64.
Checking if it is finished: not yet, next check in 10 seconds ( 0% done).
Checking if it is finished: not yet, next check in 203 seconds ( 1% done).
Checking if it is finished: not yet, next check in 174 seconds ( 23% done).
=============<SNIP-2>==============
From there on the simulation just hangs for many hours (I've tried this procedure repeatedly), which it should not, because my SWEEPS and THERMALIZATION are not excessively large.
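To pin down where the reported progress stalls, the "Checking if it is finished" lines of a saved log could be extracted with a small script. This is only a sketch in Python: the regular expression is based solely on the three progress lines quoted above, and the function name `progress_reports` is made up for illustration.

```python
import re

# Pattern derived from the progress lines quoted above, e.g.
# "... next check in 203 seconds ( 1% done)."
PROGRESS = re.compile(r"next check in (\d+) seconds \(\s*(\d+)% done\)")

# Sample log text copied from the excerpt above.
LOG = """Starting task 64.
Checking if it is finished: not yet, next check in 10 seconds ( 0% done).
Checking if it is finished: not yet, next check in 203 seconds ( 1% done).
Checking if it is finished: not yet, next check in 174 seconds ( 23% done).
"""


def progress_reports(log_text):
    """Return (seconds_until_next_check, percent_done) for each progress line."""
    return [(int(sec), int(pct)) for sec, pct in PROGRESS.findall(log_text)]


if __name__ == "__main__":
    reports = progress_reports(LOG)
    for wait, pct in reports:
        print(f"next check in {wait:4d} s, {pct:3d}% done")
    if reports:
        print("last reported progress:", reports[-1][1], "%")
```

For the excerpt above, the last reported progress is 23%, after which no further check is printed.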
Any ideas on why the re-started simulations don't seem to proceed are appreciated.
With regards, Vipin