There are four CUDA-capable devices available:

teslabot$ ./deviceQuery | grep -i "device [0-9]\|capability"
Device 0: "Tesla C2050 / C2070"
  CUDA Capability Major/Minor version number:    2.0
Device 1: "Tesla C2050 / C2070"
  CUDA Capability Major/Minor version number:    2.0
Device 2: "GeForce GTX 295"
  CUDA Capability Major/Minor version number:    1.3
Device 3: "GeForce GTX 295"
  CUDA Capability Major/Minor version number:    1.3

cuda-gdb sees only one of them:

teslabot$ cuda-gdb vector_add
NVIDIA (R) CUDA Debugger
4.0 release
Portions Copyright (C) 2007-2011 NVIDIA Corporation
GNU gdb 6.6
Copyright (C) 2006 Free Software Foundation, Inc.
[...]
(cuda-gdb) break vector_add_gpu
Breakpoint 1 at 0x400ddb: file vector_add.cu, line 7.
(cuda-gdb) run
[...]
(cuda-gdb) info cuda devices
  Dev Description SM Type SMs Warps/SM Lanes/Warp Max Regs/Lane Active SMs Mask
*   0       gt200   sm_13  30       32         32           128 0x00000001

I have checked that code built with -gencode arch=compute_20,code=sm_20 compiles without errors on this machine, and that printf inside a CUDA kernel works correctly when the code is compiled for sm_20.

How can I make cuda-gdb see all devices (except perhaps the one used for graphics, although in this case I am logged in remotely via SSH), or at least one Tesla / sm_20 device?


Following the advice in Michael Foukarakis's response, I set the CUDA_VISIBLE_DEVICES environment variable to "0,1", i.e. made only the Teslas visible. Running info cuda devices then fails with the following error:

(cuda-gdb) info cuda devices
fatal:  All CUDA devices are used for X11 and cannot be used while debugging. (error code = 24)

How can I check which devices are used by X11 (X.Org), and how can I make the X Window System use the GeForce cards rather than the Teslas?

Jakub Narębski
  • `info cuda devices` should only show the card or cards running kernels or with valid contexts held by the current debugging session. `info cuda system` should show whether all the cards are visible or not. Note that there isn't any guarantee of enumeration consistency between the driver (so what nvidia-smi or cuda-gdb shows) and the API. The `CUDA_VISIBLE_DEVICES` mechanism and/or driver compute mode status is the best way to steer code onto the hardware you want. – talonmies Jan 11 '12 at 13:14
  • `cuda-gdb` version 4.0 (from CUDA SDK 4.0.17) does not have **`system`** info: `info cuda system` results in *"Unrecognized option: 'system'."* – Jakub Narębski Jan 11 '12 at 13:57

1 Answer

Make sure the CUDA_VISIBLE_DEVICES environment variable lists all the devices you want to use. For example, on a machine with these devices:

$ ./deviceQuery -noprompt | egrep "^Device"
Device 0: "Tesla C2050"
Device 1: "Tesla C1060"
Device 2: "Quadro FX 3800"

By setting the variable you can make only a subset of them visible to the runtime:

$ export CUDA_VISIBLE_DEVICES="0,2"
$ ./deviceQuery -noprompt | egrep "^Device"
Device 0: "Tesla C2050"
Device 1: "Quadro FX 3800"
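
The variable can also be built from `deviceQuery` output instead of being typed by hand. The following is only a sketch: `select_devices` is a hypothetical helper (not part of the CUDA SDK), and it assumes `deviceQuery` prints the `Device N:` and `CUDA Capability Major/Minor version number:` lines in the format shown in the question.

```shell
#!/bin/sh
# Hypothetical helper, not part of the CUDA SDK.
# Reads deviceQuery output on stdin and prints a comma-separated list of
# the device indices whose compute capability major version is at least
# the first argument (default: 2).
select_devices() {
    min_major="${1:-2}"
    awk -v min="$min_major" '
        # Remember the index from lines such as: Device 0: "Tesla C2050"
        /^Device [0-9]+:/ { idx = $2; sub(/:$/, "", idx) }
        # The last field of the capability line is "major.minor", e.g. 2.0
        /CUDA Capability Major\/Minor version number:/ {
            if (int($NF) >= min + 0)
                list = (list == "" ? idx : list "," idx)
        }
        END { print list }
    '
}

# Usage (assumption: ./deviceQuery from the SDK is in the current directory):
#   unset CUDA_VISIBLE_DEVICES
#   export CUDA_VISIBLE_DEVICES="$(./deviceQuery | select_devices 2)"
```

Run `deviceQuery` with `CUDA_VISIBLE_DEVICES` unset when building the list, because its numbering already reflects whatever subset the variable currently selects.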
Michael Foukarakis
  • When I used `export CUDA_VISIBLE_DEVICES="0,1"` (i.e. only CUDA devices with capability 2.0 - Teslas) I got the following error when trying to run `info cuda devices`: **fatal: All CUDA devices are used for X11 and cannot be used while debugging. (error code = 24)**. So how can I make X11 use the GeForce cards for display instead of the Teslas? – Jakub Narębski Jan 11 '12 at 13:07
  • @JakubNarębski: The CUDA driver has the facility to limit or prohibit the use of any given card via the "compute mode" setting. You can use the SDK `deviceQuery` or `nvidia-smi` to check whether the Teslas are set to compute exclusive or compute prohibited. That could be the source of the problem. – talonmies Jan 11 '12 at 13:26
  • @talonmies: How do I change the "compute mode" setting, then (on Linux with X.Org)? – Jakub Narębski Jan 11 '12 at 13:50
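
The compute mode check suggested in the comments could be scripted roughly as follows. This is a sketch under assumptions: `prohibited_devices` is a hypothetical helper, and the `--query-gpu` and `-c` options shown in the usage comments are those of recent `nvidia-smi` releases, so the version shipped alongside CUDA 4.0 may use different options.

```shell
#!/bin/sh
# Hypothetical helper: reads CSV rows of the form
#   index, name, compute_mode
# on stdin and prints the index of every device whose compute mode
# (the last field) is Prohibited.
prohibited_devices() {
    awk -F', *' 'tolower($NF) ~ /prohibited/ { print $1 }'
}

# Usage with a recent nvidia-smi (assumption: these options are available
# in the installed driver version):
#   nvidia-smi --query-gpu=index,name,compute_mode --format=csv,noheader \
#       | prohibited_devices
#
# Resetting device 0 to the default compute mode normally requires root:
#   sudo nvidia-smi -i 0 -c 0
```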