
I am trying to offload OpenMP code to the GPU on my local machine, which is equipped with a GTX 1060 graphics card. All of my CUDA and cuBLAS examples run just fine, but OpenMP offloading simply does not work. To get OpenMP 5.0 support, I compiled the GCC 10.2.0 toolchain. After some debugging, I found that the OpenMP runtime does not see any devices. For example, this code prints zero:

#include <omp.h>
#include <stdio.h>

int main() {
    printf("%d\n", omp_get_num_devices());
    return 0;
}

However, the Nvidia toolchain is up and running:

$ nvidia-smi 
Sun Feb 21 23:06:40 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:1D:00.0 Off |                  N/A |
|  0%   37C    P8    12W / 200W |    584MiB /  6075MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

So what am I missing? How can I get the OpenMP runtime to find the device?

EDIT:

I am appending the information about my compiler:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/gcc/10.2.0/libexec/gcc/x86_64-pc-linux-gnu/10.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ./configure --prefix=/opt/gcc/10.2.0/
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.2.0 (GCC)

The code was compiled with the following command:

gcc -fopenmp simple.c
Addman
  • Which compiler and what compile options do you use? – Hristo Iliev Feb 22 '21 at 08:24
  • I used GCC 10.2.0 with -fopenmp flag – Addman Feb 22 '21 at 08:34
  • `-fopenmp` does not enable offloading. You need to also pass `-foffload=nvptx-none` to tell GCC that you want offloading to NVIDIA devices. If you get an error that `nvptx-none` is not a supported offload target, then your GCC was not built with support for it. `gcc -v` shows the build configuration. – Hristo Iliev Feb 22 '21 at 09:07
  • You can make this into an answer. I didn't know I had to build my compiler with offload support (and it was not so easy). It took me a day, but it's working now. I'm going to edit my question and add the outputs mentioned. – Addman Feb 23 '21 at 13:16
  • There you go, an answer with a link to the GCC wiki page on offloading. – Hristo Iliev Feb 23 '21 at 16:23

1 Answer


To compile OpenMP code with offloading support, you need to tell GCC the exact platform to target. This is achieved with the -foffload=<platform> command line option. For NVIDIA devices, the platform is nvptx-none, i.e., you have to compile with:

gcc -fopenmp -foffload=nvptx-none simple.c

Although GCC supports offloading to several target platforms, not every distribution of GCC has them enabled, because of the extra dependencies that entails. For example, on my Arch Linux system, GCC is not compiled with offloading support at all. If the previous command fails with an error saying that nvptx-none is not a supported offload target, your GCC was not configured with support for NVIDIA devices. gcc -v shows you, among other things, how the compiler was configured. Look for --enable-offload-targets=nvptx-none among the configuration options. The gcc -v output in the question lists only --prefix=/opt/gcc/10.2.0/, so that build has no offload targets enabled.

The Offloading page on the GCC wiki provides more details about the supported offload targets and how to build them.

Hristo Iliev
  • Hi, When I do `-foffload=nvptx-none`, it compiles, but `omp_get_num_devices()` returns 0. With `-foffload=nvidia-ptx` I get the `GCC is not configured to support nvidia-ptx as offload target` error. – Shihab Shahriar Khan Mar 29 '23 at 21:28
  • @ShihabShahriarKhan you may need to compile your own version of GCC with proper NVidia support – Hristo Iliev Mar 31 '23 at 10:20
  • @HristoIliev, so is the flag `-foffload=nvidia-ptx` or `-foffload=nvptx-none`, or do the two do identical things behind the scenes? Thanks – velenos14 May 27 '23 at 23:21
  • @velenos14 The value after `-foffload` is the offload target name. The name could change between GCC releases, therefore one should run `gcc -v` and look for `OFFLOAD_TARGET_NAMES`. In my case (GCC 9.4.0 from stock Ubuntu package) the value is `OFFLOAD_TARGET_NAMES=nvptx-none:hsa`, so the argument would be `-foffload=nvptx-none` for NVidia PTX or `-foffload=hsa` for AMD HSAIL devices. – Hristo Iliev May 29 '23 at 07:12