
I'm using MVAPICH2 2.1 on a Debian 7 machine with multiple Tesla K40m cards. The code is as follows.

#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <cuda_runtime.h>
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Status status;
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    // Both ranks use GPU 0 for the first transfer.
    cudaSetDevice(0);
    if (rank == 0) {
        srand(time(0));
        float* a;
        float num = rand();
        cudaMalloc(&a, sizeof(float));
        cudaMemcpy(a, &num, sizeof(float), cudaMemcpyDefault);
        MPI_Send(a, sizeof(float), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        printf("sent %f\n", num);
    } else {
        float* a;
        float num;
        cudaMalloc(&a, sizeof(float));
        MPI_Recv(a, sizeof(float), MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
        cudaMemcpy(&num, a, sizeof(float), cudaMemcpyDefault);
        printf("received %f\n", num);
    }
    // Switch both ranks to GPU 1 and repeat the same transfer.
    cudaSetDevice(1);
    if (rank == 0) {
        float* a;
        float num = rand();
        cudaMalloc(&a, sizeof(float));
        cudaMemcpy(a, &num, sizeof(float), cudaMemcpyDefault);
        MPI_Send(a, sizeof(float), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        printf("sent %f\n", num);
    } else {
        float* a;
        float num;
        cudaMalloc(&a, sizeof(float));
        MPI_Recv(a, sizeof(float), MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
        cudaMemcpy(&num, a, sizeof(float), cudaMemcpyDefault);
        printf("received %f\n", num);
    }
    MPI_Finalize();
    return 0;
}

In short, I first set the device to GPU 0 and send something; then I set the device to GPU 1 and send something else.

The output is as follows.

sent 1778786688.000000
received 1778786688.000000
[debian:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
[debian:mpispawn_0][readline] Unexpected End-Of-File on file descriptor 7. MPI process died?
[debian:mpispawn_0][mtpmi_processops] Error while reading PMI socket. MPI process died?
[debian:mpispawn_0][child_handler] MPI process (rank: 0, pid: 30275) terminated with signal 11 -> abort job
[debian:mpirun_rsh][process_mpispawn_connection] mpispawn_0 from node debian aborted: Error while reading a PMI socket (4)

So the first send is OK. But as soon as I set the device to the other GPU and then call MPI_Send, boom! I wonder why this is happening.
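For reference, this is roughly the kind of CUDA-only check I ran (no MPI involved); it's a sketch rather than my exact test program, but this pattern runs cleanly here, so the device switch by itself seems fine.

#include <cstdio>
#include <cuda_runtime.h>

// Simple error check for illustration; aborts main() on the first CUDA error.
#define CHECK(call) do { \
    cudaError_t err = (call); \
    if (err != cudaSuccess) { \
        printf("CUDA error %s at line %d\n", cudaGetErrorString(err), __LINE__); \
        return 1; \
    } \
} while (0)

int main() {
    float num = 42.0f, out = 0.0f;
    float* a;
    // Same pattern as the MPI program, minus the MPI calls:
    // touch GPU 0, then switch to GPU 1 and do a round trip.
    CHECK(cudaSetDevice(0));
    CHECK(cudaMalloc(&a, sizeof(float)));
    CHECK(cudaMemcpy(a, &num, sizeof(float), cudaMemcpyDefault));
    CHECK(cudaFree(a));
    CHECK(cudaSetDevice(1));
    CHECK(cudaMalloc(&a, sizeof(float)));
    CHECK(cudaMemcpy(a, &num, sizeof(float), cudaMemcpyDefault));
    CHECK(cudaMemcpy(&out, a, sizeof(float), cudaMemcpyDefault));
    CHECK(cudaFree(a));
    printf("round trip on GPU 1: %f\n", out);
    return 0;
}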

Also, I built MVAPICH2 with the following configure command.

./configure --enable-cuda --with-cuda=/usr/local/cuda --with-device=ch3:mrail --enable-rdma-cm

I have debugging enabled, so the stack trace is printed below. Hopefully this helps.

sent 1377447040.000000
received 1377447040.000000
[debian:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
[debian:mpi_rank_0][print_backtrace]   0: /home/lyt/local/lib/libmpi.so.12(print_backtrace+0x1c) [0x7fba26a00b3c]
[debian:mpi_rank_0][print_backtrace]   1: /home/lyt/local/lib/libmpi.so.12(error_sighandler+0x59) [0x7fba26a00c39]
[debian:mpi_rank_0][print_backtrace]   2: /lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0) [0x7fba23ffe8d0]
[debian:mpi_rank_0][print_backtrace]   3: /usr/lib/libcuda.so.1(+0x21bb30) [0x7fba26fa9b30]
[debian:mpi_rank_0][print_backtrace]   4: /usr/lib/libcuda.so.1(+0x1f6695) [0x7fba26f84695]
[debian:mpi_rank_0][print_backtrace]   5: /usr/lib/libcuda.so.1(+0x205586) [0x7fba26f93586]
[debian:mpi_rank_0][print_backtrace]   6: /usr/lib/libcuda.so.1(+0x17ad88) [0x7fba26f08d88]
[debian:mpi_rank_0][print_backtrace]   7: /usr/lib/libcuda.so.1(cuStreamWaitEvent+0x63) [0x7fba26ed72e3]
[debian:mpi_rank_0][print_backtrace]   8: /usr/local/cuda/lib64/libcudart.so.6.5(+0xa023) [0x7fba27cff023]
[debian:mpi_rank_0][print_backtrace]   9: /usr/local/cuda/lib64/libcudart.so.6.5(cudaStreamWaitEvent+0x1ce) [0x7fba27d2cf3e]
[debian:mpi_rank_0][print_backtrace]  10: /home/lyt/local/lib/libmpi.so.12(MPIDI_CH3_CUDAIPC_Rendezvous_push+0x17f) [0x7fba269f25bf]
[debian:mpi_rank_0][print_backtrace]  11: /home/lyt/local/lib/libmpi.so.12(MPIDI_CH3_Rendezvous_push+0xe3) [0x7fba269a0233]
[debian:mpi_rank_0][print_backtrace]  12: /home/lyt/local/lib/libmpi.so.12(MPIDI_CH3I_MRAILI_Process_rndv+0xa4) [0x7fba269a0334]
[debian:mpi_rank_0][print_backtrace]  13: /home/lyt/local/lib/libmpi.so.12(MPIDI_CH3I_Progress+0x19a) [0x7fba2699aeaa]
[debian:mpi_rank_0][print_backtrace]  14: /home/lyt/local/lib/libmpi.so.12(MPI_Send+0x6ef) [0x7fba268d118f]
[debian:mpi_rank_0][print_backtrace]  15: ./bin/minimal.run() [0x400c15]
[debian:mpi_rank_0][print_backtrace]  16: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fba23c67b45]
[debian:mpi_rank_0][print_backtrace]  17: ./bin/minimal.run() [0x400c5c]
[debian:mpispawn_0][readline] Unexpected End-Of-File on file descriptor 6. MPI process died?
[debian:mpispawn_0][mtpmi_processops] Error while reading PMI socket. MPI process died?
[debian:mpispawn_0][child_handler] MPI process (rank: 0, pid: 355) terminated with signal 11 -> abort job
[debian:mpirun_rsh][process_mpispawn_connection] mpispawn_0 from node debian8 aborted: Error while reading a PMI socket (4)
  • Have you tried removing the second half and making sure it doesn't still give you a seg fault? – Christian Sarofeen May 26 '15 at 12:16
  • Yes. I am sure the problem lies in `setDevice(1);` – Hot.PxL May 26 '15 at 13:21
  • Does `setDevice(1)` still seg fault in a cuda only program? Also, how are you so sure? – Christian Sarofeen May 26 '15 at 13:31
  • Try adding [proper cuda error checking](http://stackoverflow.com/questions/14038589/what-is-the-canonical-way-to-check-for-errors-using-the-cuda-runtime-api) to all API calls. I'm not sure there's any issue there, but it may turn up some useful clues. – Robert Crovella May 26 '15 at 14:10
  • @ChristianSarofeen No.. It happens when I use a combination of CUDA and MPI. And when I switch devices.. I've done all kinds of AB tests. That's why I'm so sure... – Hot.PxL May 26 '15 at 17:05
  • @RobertCrovella Hi. I checked for CUDA errors. But there is none. CUDA driver is functioning properly, with all other apps without MPI. – Hot.PxL May 26 '15 at 17:05
  • What compute modes are the GPUs set to? What mpirun or mpiexec command are you using to launch this? – Robert Crovella May 26 '15 at 17:16
  • @RobertCrovella What do you mean by compute modes? I use `mpirun_rsh -np 2 debian debian MV2_USE_CUDA=1 $1` to execute the binary. I ran 2 processors on the same machine. – Hot.PxL May 27 '15 at 02:57
  • compute mode for a gpu is listed using the command `nvidia-smi -a`. If you do `nvidia-smi --help`, or refer to the `nvidia-smi` man page, you can get a description of compute mode. I'm wondering if it is set the same for both GPUs in question here, and what it is set to. – Robert Crovella May 27 '15 at 04:55
  • @RobertCrovella Thanks for the instruction. All compute modes are set to Default. – Hot.PxL May 27 '15 at 05:07
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/78858/discussion-between-hot-pxl-and-robert-crovella). – Hot.PxL May 27 '15 at 05:09

1 Answer


I'm afraid MVAPICH2 does not yet support using multiple GPUs from the same process (source: mailing list).

Advanced memory transfer operations require the library to store device-specific structures internally, so unless there is explicit support for multiple devices per process, I'm afraid there is no way to make your code run as written.

On the other hand, you can of course use multiple GPUs by running a separate process per device.
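As a minimal sketch of that approach (reusing the code from your question, but binding each rank to its own GPU once and never switching devices afterwards):

#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <cuda_runtime.h>
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Status status;
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // One GPU per process: rank 0 -> GPU 0, rank 1 -> GPU 1.
    // Assumes at least one visible CUDA device.
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);
    cudaSetDevice(rank % deviceCount);

    float* a;
    cudaMalloc(&a, sizeof(float));
    if (rank == 0) {
        srand(time(0));
        float num = rand();
        cudaMemcpy(a, &num, sizeof(float), cudaMemcpyDefault);
        MPI_Send(a, sizeof(float), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        printf("sent %f\n", num);
    } else {
        float num;
        MPI_Recv(a, sizeof(float), MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
        cudaMemcpy(&num, a, sizeof(float), cudaMemcpyDefault);
        printf("received %f\n", num);
    }
    cudaFree(a);
    MPI_Finalize();
    return 0;
}

Launched the same way as before (e.g. `mpirun_rsh -np 2 debian debian MV2_USE_CUDA=1 ./bin/minimal.run`), each process then only ever touches a single device, which is the usage pattern MVAPICH2's CUDA support expects.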
