Dear developers,
I recently checked out the newest version of the Alps code rev. 7291 and compiled the Hybridization expansion Impurity solver with the Intel compiler composer_xe_2013.3.163 on Ubuntu 12.04 LTS.
My problem is that the code takes longer to exit than the MAX_TIME specified in the parameter file (SWEEPS is set much higher than can be completed in this time). For example:
MAX_TIME = 30 --> runs for 60 sec
MAX_TIME = 60 --> runs for 180 sec
MAX_TIME = 120 --> runs for 180 sec
MAX_TIME = 180 --> runs for 420 sec
MAX_TIME = 240 --> runs for 420 sec
The behaviour is similar for longer times and with MPI parallelization. With an older version, checked out about 2 months ago, this problem did not occur. Could you check whether it is caused by some change in the code?
Thanks, Steffen
Hi Steffen,
I don’t know what’s going on but I’ll look into it. The underlying framework may have changed how often it checks if time’s up.
Our ‘fraction_completed’ function uses only sweeps, not time, to see how far along we are in the simulation:
hyb.hpp:61: double fraction_completed() const {return is_thermalized()?(sweeps-thermalization_sweeps)/(double)total_sweeps:0.; }
If the simulation is supposed to stop when time’s up, then we can change that (or add another criterion) so that it exits based on time.
Does anybody know what changed in ngs?
Emanuel
On Jan 3, 2014, at 7:25 PM, backes@th.physik.uni-frankfurt.de wrote: [original message, see above]
Hi
In ngs, the check of the sweeps and the check of the elapsed time and termination signals are separated. The simulation class provides a fraction_completed function, which reports the progress in sweeps, while the callback passed to the run function (normally alps::check_callback) checks the time and the termination signals. If the application runs too long, then either the callback does not check the time properly or the check schedule does not check the progress often enough.
The check_schedule can be initialized with the mpi adapter:
mcmpiadapter(parameters, mpi_communicator, check_schedule(tmin, tmax));
Best Lukas
2014/1/4 Emanuel Gull emanuel.gull@gmail.com: [message quoted above]
Hi Lukas,
thanks for your message! Yes, this was the problem. The callback function does indeed check the time, but the schedule_checker is updated only with the fraction of sweeps done. It therefore sets the next check to 1/4 of the approximate time left, estimated from the fraction of sweeps done and not from the elapsed time. So the checking interval usually increases even if there is, e.g., only one second left for the calculation, and the solver checks for MAX_TIME too late. I already checked in a modification where the schedule_checker is updated like this:
schedule_checker.update( std::max(fraction, time_fraction) );
which works quite ok.
best wishes, Steffen
No, this has to be done differently: since the maxtime is given as a command-line parameter, we do not want a redundant parameter in the parameter file. If this is a problem, we need to change the callback to return a float, or to check the callback more often than fraction_completed. As a quick fix, I suggest setting the Tmax parameter of the schedule_checker to a smaller value, like 60 s.
Best Lukas
2014/1/6 backes@th.physik.uni-frankfurt.de: [message quoted above]
comp-phys-alps-users@lists.phys.ethz.ch