I have a c++ solver which I need to run in parallel using the following command:

nohup mpirun -np 16 ./my_exec > log.txt &

This command will run my_exec independently on the 16 processors available on my node. This used to work perfectly.

Last week, the HPC department performed an OS upgrade and now, when launching the same command, I get two warning messages (for each processor). The first one is:

--------------------------------------------------------------------------
WARNING: It appears that your OpenFabrics subsystem is configured to only
allow registering part of your physical memory.  This can cause MPI jobs to
run with erratic performance, hang, and/or crash.

This may be caused by your OpenFabrics vendor limiting the amount of
physical memory that can be registered.  You should investigate the
relevant Linux kernel module parameters that control how much physical
memory can be registered, and increase them to allow registering all
physical memory on your machine.

See this Open MPI FAQ item for more information on these Linux kernel module
parameters:

    http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages

  Local host:              tamnun
  Registerable memory:     32768 MiB
  Total memory:            98294 MiB

Your MPI job will continue, but may be behave poorly and/or hang.
--------------------------------------------------------------------------

I then get output from my code telling me that it thinks I am launching only one realization of the code (Nprocs = 1 instead of 16).

# MPI IS ON; Nprocs = 1
Filename = ../input/odtParam.inp

# MPI IS ON; Nprocs = 1

***** Error, process 0 failed to create ../data/data_0/, or it was already there

Finally, the second warning message is:

--------------------------------------------------------------------------
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process.  Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption.  The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.

The process that invoked fork was:

  Local host:          tamnun (PID 17446)
  MPI_COMM_WORLD rank: 0

If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--------------------------------------------------------------------------

After looking around online, I tried following the second warning's advice by setting the MCA parameter mpi_warn_on_fork to 0 with the command:

nohup mpirun --mca mpi_warn_on_fork 0 -np 16 ./my_exec > log.txt &

which yielded the following error message:

[mpiexec@tamnun] match_arg (./utils/args/args.c:194): unrecognized argument mca
[mpiexec@tamnun] HYDU_parse_array (./utils/args/args.c:214): argument matching returned error
[mpiexec@tamnun] parse_args (./ui/mpich/utils.c:2964): error parsing input array
[mpiexec@tamnun] HYD_uii_mpx_get_parameters (./ui/mpich/utils.c:3238): unable to parse user arguments

I am using Red Hat 6.7 (Santiago). I contacted the HPC department, but since I am at a university, it may take them a day or two to respond. Any help or guidance would be appreciated.

EDIT in response to the answer:

Indeed, I was compiling my code with Open MPI's mpic++ but running the executable with Intel's mpirun, hence the error (after the OS upgrade, Intel's mpirun had been set as the default). I had to put the path to Open MPI's mpirun at the beginning of the $PATH environment variable.
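
Concretely, I did something along these lines (the Open MPI install path below is just a placeholder; the real location differs from cluster to cluster):

# prepend Open MPI's bin directory so its mpirun is found before Intel's
export PATH=/path/to/openmpi/bin:$PATH

# verify which launcher is now picked up
which mpirun
mpirun --version    # should now report "mpirun (Open MPI) ...", not Intel MPI / Hydra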

The code now runs as expected, BUT I still get the first warning message above (the second warning, which advised setting the MCA parameter mpi_warn_on_fork, no longer appears). I think (but am not sure) this is an issue I need to resolve with the HPC department.

solalito

1 Answer

[mpiexec@tamnun] match_arg (./utils/args/args.c:194): unrecognized argument mca
[mpiexec@tamnun] HYDU_parse_array (./utils/args/args.c:214): argument matching returned error
[mpiexec@tamnun] parse_args (./ui/mpich/utils.c:2964): error parsing input array
                                  ^^^^^
[mpiexec@tamnun] HYD_uii_mpx_get_parameters (./ui/mpich/utils.c:3238): unable to parse user arguments
                                                  ^^^^^

You are using MPICH in the last case. MPICH is not Open MPI, and its process launcher does not recognize the `--mca` parameter, which is specific to Open MPI (MCA stands for Modular Component Architecture, the basic framework that Open MPI is built upon). This is a typical case of mixing up multiple MPI implementations.
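
One quick way to see which implementation the compiler wrapper and the launcher actually come from (a sketch; wrapper names and flags depend on which MPI stacks are installed on the cluster):

which mpirun mpic++       # are both from the same MPI installation?
mpirun --version          # Open MPI identifies itself as "mpirun (Open MPI) x.y.z"
mpic++ --showme:command   # Open MPI wrappers understand --showme; MPICH-based ones use "mpicxx -show"

Once `mpirun` really is Open MPI's, MCA parameters can also be passed through the environment, e.g. `export OMPI_MCA_mpi_warn_on_fork=0`, which is equivalent to `--mca mpi_warn_on_fork 0`.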

Hristo Iliev
  • Thanks for your answer! However, I'm not sure where to start to fix it. Any advice? – solalito Nov 18 '15 at 16:44
  • Start by finding out what MPI implementations are installed on the machine and how you can switch between them. Also, make sure that you use `mpirun` from the same implementation that was used to compile the program. Compiling against Open MPI and running with the MPICH runtime (or vice versa) simply doesn't work; as a result you get a bunch of singleton processes, each with rank 0 in its own `MPI_COMM_WORLD`, as you have already observed. – Hristo Iliev Nov 18 '15 at 21:18
  • See my edit. I will accept your answer as it put me on the right track (solution was easy once the problem was diagnosed). – solalito Nov 19 '15 at 08:58
  • The warning stems from misconfigured InfiniBand drivers. See [this question](http://stackoverflow.com/questions/17755433/how-can-i-increase-openfabrics-memory-limit-for-torque-jobs) and also the [Open MPI FAQ on InfiniBand](http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages). Fixing that requires intervention from the system administrators (it involves changes to certain kernel module parameters), so you should push your HPC department to do it. – Hristo Iliev Nov 19 '15 at 09:06
  • From what I understand, the default locked memory made available on my node is too low and needs to be raised (in the warning message, registerable memory is lower than total memory, hence the issue). However, `ulimit -l` returns `unlimited` – solalito Nov 19 '15 at 09:21
  • 1
  • No, it is about the configuration of the InfiniBand module. Each InfiniBand adapter has its own virtual-to-physical address translation unit (MMU), similar to the one in the CPU. It allows the adapter to work directly with process virtual addresses instead of with physical ones via a process known as memory registration, which greatly simplifies the API as many things can be done from userspace. But for that to work, the MMU on the InfiniBand adapter has to have enough translation table entries to cover the entire RAM. This is usually controlled by a parameter to the kernel module (see the sketch after these comments). – Hristo Iliev Nov 19 '15 at 09:49
  • 1
  • Registering memory is a two-step process. First, the data has to be fixed in physical memory by locking it, hence you need to remove the limit on locked memory via `ulimit -l unlimited`. Then, the InfiniBand driver builds translation table entries that describe the physical memory locations. If the table does not have enough entries, the driver won't be able to simultaneously register enough physical memory, which could lead to problems if your application is sending large amounts of data around. It is just a warning and most programs will simply work as is. – Hristo Iliev Nov 19 '15 at 09:59
  • Thanks for the extensive answer! I emailed my administrators and hopefully they will fix it. – solalito Nov 19 '15 at 10:20
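
For reference, a sketch of the kind of changes the administrators would have to make on a Mellanox mlx4-based system (the module name, file locations, and values below are illustrative; the right numbers depend on the node's hardware and RAM, see the Open MPI FAQ linked above):

# 1) allow users to lock enough memory, e.g. in /etc/security/limits.conf
*  soft  memlock  unlimited
*  hard  memlock  unlimited

# 2) give the adapter enough translation table entries to cover all RAM,
#    e.g. in /etc/modprobe.d/mlx4_core.conf (registerable memory is roughly
#    2^log_num_mtt * 2^log_mtts_per_seg * page_size)
options mlx4_core log_num_mtt=24 log_mtts_per_seg=1

# after reloading the module / rebooting, the settings can be checked from the user side
cat /sys/module/mlx4_core/parameters/log_num_mtt
ulimit -l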