In my previous mail I forgot to mention that the sample MPI program compiles and runs fine without any errors. I kindly request your suggestions on how to solve the MPI problem in ALPS which I described in my previous mails.
with best regards
Sunil
From: Comp-phys-alps-users [comp-phys-alps-users-bounces@lists.phys.ethz.ch] on behalf of D'Souza, Sunil Wilfred [sunilwilfred@cpfs.mpg.de]
Sent: Monday, December 21, 2015 8:10 AM
To: comp-phys-alps-users@lists.phys.ethz.ch
Subject: Re: [ALPS-users] Problem in running the ALPS (spinmc) code in the parallel mode with MPI
Dear ALPS users,
As per your suggestion, I have tried running a simple MPI program by using the following commands
mpic++ -o monte_carlo_mpi monte_carlo_mpi.cpp
mpirun -np 8 monte_carlo_mpi
I have attached the sample MPI program (it computes PI by the Monte Carlo method) which I used for checking MPI. Please suggest how to solve the MPI problem in the ALPS simulations.
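In case the attachment does not come through on the list, here is a minimal sketch of such a PI-by-Monte-Carlo MPI test (an illustration of what the program does, not the exact attached file): each rank samples random points in the unit square and the hit counts are combined with MPI_Reduce.

// pi_mpi_sketch.cpp (illustrative, not the exact attached program)
#include <mpi.h>
#include <cstdio>
#include <cstdlib>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long samples_per_rank = 1000000;
    std::srand(rank + 1);  // different seed on each rank

    // Count points falling inside the unit quarter circle
    long hits = 0;
    for (long i = 0; i < samples_per_rank; ++i) {
        double x = std::rand() / (double)RAND_MAX;
        double y = std::rand() / (double)RAND_MAX;
        if (x * x + y * y <= 1.0) ++hits;
    }

    // Sum the hit counts from all ranks onto rank 0
    long total_hits = 0;
    MPI_Reduce(&hits, &total_hits, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        double pi = 4.0 * total_hits / (double)(samples_per_rank * size);
        std::printf("pi ~ %f\n", pi);
    }

    MPI_Finalize();
    return 0;
}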
with best regards
Sunil
From: Comp-phys-alps-users [comp-phys-alps-users-bounces@lists.phys.ethz.ch] on behalf of Matthias Troyer [troyer@phys.ethz.ch]
Sent: Thursday, December 17, 2015 4:18 PM
To: comp-phys-alps-users@lists.phys.ethz.ch
Subject: Re: [ALPS-users] Problem in running the ALPS (spinmc) code in the parallel mode with MPI
Have you tried running a simple MPI code that you compiled yourself using the same MPI installation?

Dear ALPS users,
I am trying to run a spinmc simulation in parallel mode using the command
mpiexec -np 2 spinmc --mpi parm2a.in.xml --Tmin 5
On executing the command, I get the following error messages:
[LIN21:20344] *** Process received signal ***
[LIN21:20344] Signal: Segmentation fault (11)
[LIN21:20344] Signal code: Address not mapped (1)
[LIN21:20344] Failing at address: (nil)
[LIN21:20344] [ 0] /lib64/libpthread.so.0(+0xf9f0) [0x7f85f73969f0]
[LIN21:20344] [ 1] /opt/openmpi/1.4.5/gcc/lib64/openmpi/mca_pml_v.so(+0x1b62) [0x7f85f1359b62]
[LIN21:20344] [ 2] /opt/openmpi/1.4.5/gcc/lib64/libopen-pal.so.0(mca_base_components_close+0x76) [0x7f85f66286c6]
[LIN21:20344] [ 3] /opt/openmpi/1.4.5/gcc/lib64/libmpi.so.0(mca_pml_base_select+0x301) [0x7f85f6b2cd11]
[LIN21:20344] [ 4] /opt/openmpi/1.4.5/gcc/lib64/libmpi.so.0(+0x39acc) [0x7f85f6aecacc]
[LIN21:20344] [ 5] /opt/openmpi/1.4.5/gcc/lib64/libmpi.so.0(MPI_Init+0x16b) [0x7f85f6b0ca8b]
[LIN21:20344] [ 6] /home/sunil/ALPS2.2/lib/libalps.so.2(_ZN4alps9comm_initERiRPPcb+0x58) [0x7f85f8327088]
[LIN21:20344] [ 7] /home/sunil/ALPS2.2/lib/libalps.so.2(_ZN4alps9scheduler5startEiPPcRKNS0_7FactoryE+0x37) [0x7f85f85678d7]
[LIN21:20344] [ 8] spinmc(main+0x15) [0x449a85]
[LIN21:20344] [ 9] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f85f454bbe5]
[LIN21:20344] [10] spinmc() [0x449dd5]
[LIN21:20344] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 20344 on node LIN21 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
I have tried running a band structure code in parallel mode using mpirun, and I have no problems running that code; it is only with ALPS that I encounter the above error. I am attaching the CMakeCache.txt file for your consideration, so that you can verify that all the fields are correct. I have cross-checked the MPI library path in the CMakeCache.txt file and it appears to be correct. I am not able to figure out what exactly is causing this problem.
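In case it helps narrow things down, one cross-check I can think of (my own guess at a useful diagnostic, using standard Linux tools) would be to confirm that libalps links against the same Open MPI installation that provides the mpiexec being used:

ldd /home/sunil/ALPS2.2/lib/libalps.so.2 | grep mpi
which mpiexec
mpiexec --version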
I kindly request your help in solving this issue.
I thank you in advance for your time and consideration.
with best regards
Sunil
<CMakeCache.txt>