Dear ALPS users, I have installed ALPS 2.2 on my local machine, which has 12 processing units, and I have tried running a simple tutorial from the ALPS example files. When I use the command "pyalps.runApplication('spinmc',input_file,Tmin=5)", everything works fine: the ALPS program runs in serial mode and gives the expected output. However, when I use the command "pyalps.runApplication('spinmc',input_file,Tmin=5,MPI=7)", I get many error messages related to a segmentation fault. The error messages are as follows:
mpirun -np 7 spinmc --mpi parm2a.in.xml --Tmin 5
[LIN21:04682] *** Process received signal ***
[LIN21:04682] Signal: Segmentation fault (11)
[LIN21:04682] Signal code: Address not mapped (1)
[LIN21:04682] Failing at address: (nil)
[LIN21:04682] [ 0] /lib64/libpthread.so.0(+0xf9f0) [0x7f356ffcf9f0]
[LIN21:04682] [ 1] /opt/openmpi/1.4.5/gcc/lib64/openmpi/mca_pml_v.so(+0x1b62) [0x7f3569f92b62]
[LIN21:04682] [ 2] /opt/openmpi/1.4.5/gcc/lib64/libopen-pal.so.0(mca_base_components_close+0x76) [0x7f356f2616c6]
[LIN21:04682] [ 3] /opt/openmpi/1.4.5/gcc/lib64/libmpi.so.0(mca_pml_base_select+0x301) [0x7f356f765d11]
[LIN21:04682] [ 4] /opt/openmpi/1.4.5/gcc/lib64/libmpi.so.0(+0x39acc) [0x7f356f725acc]
[LIN21:04682] [ 5] /opt/openmpi/1.4.5/gcc/lib64/libmpi.so.0(MPI_Init+0x16b) [0x7f356f745a8b]
[LIN21:04682] [ 6] /home/sunil/ALPS2.2/lib/libalps.so.2(_ZN4alps9comm_initERiRPPcb+0x58) [0x7f3570f60098]
[LIN21:04682] [ 7] /home/sunil/ALPS2.2/lib/libalps.so.2(_ZN4alps9scheduler5startEiPPcRKNS0_7FactoryE+0x37) [0x7f35711a08e7]
[LIN21:04682] [ 8] spinmc(main+0x15) [0x449a95]
[LIN21:04682] [ 9] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f356d184be5]
[LIN21:04682] [10] spinmc() [0x449de5]
[LIN21:04682] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 4682 on node LIN21 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
I kindly ask the ALPS users to help me resolve this issue. I look forward to your suggestions for overcoming this problem.
With best regards, Sunil
This looks like a problem with your MPI installation since it crashes in MPI_Init. Did you try to compile and run a simple MPI program?
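One way to run such a test is sketched below, as an illustration rather than an official ALPS procedure. It writes a minimal MPI "hello world" in C to a temporary directory, compiles it with the `mpicc` wrapper, and launches it with `mpirun -np 7`, the same process count as in the failing run. The helper names `build_commands` and `check_mpi` are hypothetical, and the script assumes that `mpicc` and `mpirun` on your PATH belong to the same MPI installation that ALPS was built against.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Minimal MPI test program in C. It only calls MPI_Init, MPI_Comm_rank,
# MPI_Comm_size and MPI_Finalize, so a crash here points at the MPI
# installation rather than at ALPS.
HELLO_MPI_C = r"""
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
"""


def build_commands(source, binary, nprocs):
    """Return the compile and run command lines as argument lists.

    Hypothetical helper: assumes the wrappers are called mpicc and mpirun.
    """
    compile_cmd = ["mpicc", str(source), "-o", str(binary)]
    run_cmd = ["mpirun", "-np", str(nprocs), str(binary)]
    return compile_cmd, run_cmd


def check_mpi(nprocs=7):
    """Compile and run the test program.

    Returns True on success, False if compiling or running fails,
    and None if mpicc or mpirun cannot be found on PATH.
    """
    if shutil.which("mpicc") is None or shutil.which("mpirun") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "hello_mpi.c"
        binary = Path(tmp) / "hello_mpi"
        source.write_text(HELLO_MPI_C)
        compile_cmd, run_cmd = build_commands(source, binary, nprocs)
        if subprocess.run(compile_cmd).returncode != 0:
            return False
        return subprocess.run(run_cmd).returncode == 0


if __name__ == "__main__":
    # Show which wrappers are picked up; a mismatch between these and the
    # MPI library linked into ALPS is worth ruling out.
    print("mpicc :", shutil.which("mpicc"))
    print("mpirun:", shutil.which("mpirun"))
    print("MPI test result:", check_mpi())
```

If this small program also fails inside MPI_Init, the problem lies in the MPI installation itself. If it runs cleanly, it is worth checking that the `mpirun` printed above is the one from the same installation that appears in the backtrace.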
On 05 Dec 2014, at 07:14, D´Souza, Sunil Wilfred sunilwilfred@cpfs.mpg.de wrote:
comp-phys-alps-users@lists.phys.ethz.ch