In my previous mail I forgot to mention that the sample MPI program compiles and runs without any errors. Could you please suggest how to solve the MPI problem in ALPS that I described in my previous mails?
with best regards
Sunil
________________________________
From: Comp-phys-alps-users [comp-phys-alps-users-bounces@lists.phys.ethz.ch] on behalf of D´Souza, Sunil Wilfred [sunilwilfred@cpfs.mpg.de]
Sent: Monday, December 21, 2015 8:10 AM
To: comp-phys-alps-users@lists.phys.ethz.ch
Subject: Re: [ALPS-users] Problem in running the ALPS (spinmc) code in the parallel mode aith MPI
Dear ALPS users,
As you suggested, I have tried compiling and running a simple MPI program with the following commands:
mpic++ -o monte_carlo_mpi monte_carlo_mpi.cpp
mpirun -np 8 monte_carlo_mpi
I have attached the sample MPI program (it computes pi by the Monte Carlo method) that I used to check the MPI installation. Please suggest how to solve the MPI problem in the ALPS simulations.
with best regards
Sunil
________________________________
From: Comp-phys-alps-users [comp-phys-alps-users-bounces@lists.phys.ethz.ch] on behalf of Matthias Troyer [troyer@phys.ethz.ch]
Sent: Thursday, December 17, 2015 4:18 PM
To: comp-phys-alps-users@lists.phys.ethz.ch
Subject: Re: [ALPS-users] Problem in running the ALPS (spinmc) code in the parallel mode aith MPI
Have you tried running a simple MPI code that you compiled yourself using the same MPI installation?
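Whether two programs really use the same MPI installation can be checked from the shell. The following is only a sketch under assumptions: mpi_link_report is a hypothetical helper name (not part of ALPS or Open MPI), and it assumes a Linux system with ldd available and spinmc on the PATH. It prints the MPI libraries that spinmc resolves at run time and the MPI launcher and compiler wrapper found first on the PATH, so the two can be compared.

```shell
#!/bin/sh
# Sketch: report which MPI shared libraries a binary resolves at run time.
# mpi_link_report is a hypothetical helper name used for illustration only.
mpi_link_report() {
    # Locate the binary; report and return quietly if it is not on PATH.
    bin=$(command -v "$1") || { echo "$1: not found on PATH"; return 0; }
    echo "binary: $bin"
    # ldd lists the shared libraries the dynamic loader would pick up.
    ldd "$bin" | grep -i -E 'libmpi|libopen-(pal|rte)' \
        || echo "no MPI library in ldd output"
}

# The ALPS application from this thread (assumed to be on PATH).
mpi_link_report spinmc

# The MPI launcher and compiler wrapper that come first on PATH.
for tool in mpiexec mpirun mpic++; do
    echo "$tool: $(command -v "$tool" || echo 'not found on PATH')"
done
```

If the library directory reported for spinmc differs from the directory of the mpiexec and mpic++ that were used for the test program, the two are not using the same MPI installation.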
On Dec 17, 2015, at 16:13, D´Souza, Sunil Wilfred <sunilwilfred@cpfs.mpg.de> wrote:
Dear ALPS users,
I am trying to run a spinmc simulation in parallel mode using the command
mpiexec -np 2 spinmc --mpi parm2a.in.xml --Tmin 5
On executing the command I get the following error messages:
[LIN21:20344] *** Process received signal ***
[LIN21:20344] Signal: Segmentation fault (11)
[LIN21:20344] Signal code: Address not mapped (1)
[LIN21:20344] Failing at address: (nil)
[LIN21:20344] [ 0] /lib64/libpthread.so.0(+0xf9f0) [0x7f85f73969f0]
[LIN21:20344] [ 1] /opt/openmpi/1.4.5/gcc/lib64/openmpi/mca_pml_v.so(+0x1b62) [0x7f85f1359b62]
[LIN21:20344] [ 2] /opt/openmpi/1.4.5/gcc/lib64/libopen-pal.so.0(mca_base_components_close+0x76) [0x7f85f66286c6]
[LIN21:20344] [ 3] /opt/openmpi/1.4.5/gcc/lib64/libmpi.so.0(mca_pml_base_select+0x301) [0x7f85f6b2cd11]
[LIN21:20344] [ 4] /opt/openmpi/1.4.5/gcc/lib64/libmpi.so.0(+0x39acc) [0x7f85f6aecacc]
[LIN21:20344] [ 5] /opt/openmpi/1.4.5/gcc/lib64/libmpi.so.0(MPI_Init+0x16b) [0x7f85f6b0ca8b]
[LIN21:20344] [ 6] /home/sunil/ALPS2.2/lib/libalps.so.2(_ZN4alps9comm_initERiRPPcb+0x58) [0x7f85f8327088]
[LIN21:20344] [ 7] /home/sunil/ALPS2.2/lib/libalps.so.2(_ZN4alps9scheduler5startEiPPcRKNS0_7FactoryE+0x37) [0x7f85f85678d7]
[LIN21:20344] [ 8] spinmc(main+0x15) [0x449a85]
[LIN21:20344] [ 9] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f85f454bbe5]
[LIN21:20344] [10] spinmc() [0x449dd5]
[LIN21:20344] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 20344 on node LIN21 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
I have tried running a band structure code in parallel using mpirun and have no problems running that code, but with ALPS I encounter the above problem. I am attaching the CMakeCache.txt file so that you can check that all the fields are correct. I have cross-checked the MPI library path in that file and it looks correct to me. I cannot figure out what exactly is causing this problem, and I would be grateful for any help in solving it.
I thank you in advance for your time and consideration.
with best regards
Sunil
<CMakeCache.txt>