NVLink is NVIDIA's high-bandwidth interconnect for direct GPU-to-GPU (and GPU-to-CPU) data transfer. nvlink is also the name of the CUDA toolkit's device-code linker.
Questions tagged [nvlink]
10 questions
4
votes
1 answer
N-body OpenCL code : error CL_OUT_OF_HOST_MEMORY with GPU card NVIDIA A6000
I would like to get an old N-body code which uses OpenCL running.
I have two NVIDIA A6000 cards connected by NVLink, a component which bridges these two GPU cards at the hardware (and maybe software?) level.
But at execution, I get the following…
user1773603
2
votes
1 answer
Is there a way to use "unified memory" (MAGMA) with 2 GPU cards linked by NVLink and 1 TB of RAM?
At work, on Debian 10, I have two RTX A6000 GPU cards connected by an NVLink hardware component and 1 TB of RAM, and I would like to benefit from the potential combined power of both cards and the 1 TB of RAM.
Currently, I have the following magma.make invoked by a Makefile…
user1773603
2
votes
1 answer
Does NVLink accelerate training with DistributedDataParallel?
Nvidia's NVLink accelerates data transfer between several GPUs on the same machine.
I train large models on such a machine using PyTorch.
I see why NVLink would make model-parallel training faster, since one pass through a model will involve several…

acl
- 254
- 1
- 5
- 13
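As a back-of-envelope illustration of why inter-GPU bandwidth matters even for data-parallel training: under ring all-reduce (the algorithm NCCL typically uses for gradient synchronization), each GPU transfers roughly 2*(n-1)/n times the gradient size per step. The function below is a hypothetical sketch of that arithmetic; the names are illustrative and not taken from the question.

```python
# Hypothetical sketch: per-step gradient traffic per GPU under ring
# all-reduce, approximately 2*(n-1)/n * gradient_bytes. Function and
# parameter names are illustrative, not from the question above.
def allreduce_bytes_per_gpu(n_params, bytes_per_param, n_gpus):
    gradient_bytes = n_params * bytes_per_param
    return 2 * (n_gpus - 1) / n_gpus * gradient_bytes

# A 1-billion-parameter fp32 model on 2 GPUs moves ~4 GB per GPU per
# step, which is why a fast link such as NVLink helps DDP throughput.
print(allreduce_bytes_per_gpu(1_000_000_000, 4, 2))  # 4000000000.0
```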
1
vote
0 answers
Don't see any transfers on NVLINK with NCCL all_sum test
With the following code (uses tensorflow.contrib.nccl.all_sum), I expected to see bytes being transferred over NVLINK.
In reality, I don't.
from tensorflow.contrib.nccl import all_sum
with tf.device('/gpu:0'):
    a = tf.get_variable(
…

auro
- 1,079
- 1
- 10
- 22
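For context on what the question above expects to observe: NCCL's all_sum is an all-reduce, i.e. every participating device ends up with the elementwise sum of all inputs, which is what should generate traffic over NVLink. A pure-Python stand-in for that semantics (illustrative only, not the TensorFlow API) might look like:

```python
# Pure-Python stand-in for all-reduce "sum" semantics: every participant
# receives the elementwise sum of all inputs. Illustrative only; the real
# tensorflow.contrib.nccl.all_sum operates on GPU-resident tensors.
def all_sum(tensors):
    total = [sum(vals) for vals in zip(*tensors)]
    return [list(total) for _ in tensors]  # one result per input device

print(all_sum([[1.0, 2.0], [3.0, 4.0]]))  # [[4.0, 6.0], [4.0, 6.0]]
```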
1
vote
2 answers
Why is nvlink warning me about lack of sm_20 (compute capability 2.0) object code?
I'm working with CUDA 6.5 on a machine with a GTX Titan card (compute capability 3.5). I'm building my code with just -gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_35,code=sm_35, and when I link my binary, nvlink says:
nvlink warning :…

einpoklum
- 118,144
- 57
- 340
- 684
0
votes
0 answers
Enable NCCL in Custom TensorFlow Build
I am building TF 2.5.2 from source, how can I enable NCCL during the build? I have NCCL built, as well as having cuDNN and CUDA 8 (i.e., I am on the build config phase). --config=nccl doesn't work. I know that --config=nonccl is a default, so how do…

BHezy
- 1
0
votes
1 answer
How to specify the Nvlink type when using NCCL
In the DGX-1 system (8x V100), there are two types of NVLink: NVLink-V1 and NVLink-V2.
Is there any way to explicitly specify which type of NVLink to use for p2p and collective communication?

Daniel
- 325
- 3
- 10
0
votes
1 answer
OpenACC nvlink undefined reference to class
I am new to OpenACC and I am writing a new program from scratch (I have a fairly good idea of which loops will be computationally costly, from working on a similar problem before). I am getting an "Undefined reference" error from nvlink. From my research, I…

WilhelmM
- 153
- 1
- 4
0
votes
1 answer
Can nvlink inline device functions from separate compilation units?
If the separate compilation units that are fed as input to nvlink contain cuda kernels and device functions that invoke device functions marked as __forceinline__, will these functions be inlined? Assume they would be inlined if one put all the…

user1823664
- 1,071
- 9
- 16
0
votes
1 answer
Odd behavior of cudaMemcpyAsync: 1. cudaMemcpyKind makes no difference. 2. Copy fails, but silently
I'm familiarizing myself with a new cluster equipped with Pascal P100 GPUs+Nvlink. I wrote a ping-pong program to test gpu<->gpu and gpu<->cpu bandwidths and peer-to-peer access. (I'm aware the cuda samples contain such a program, but I wanted to…

Michael Carilli
- 371
- 2
- 12
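The ping-pong measurement described in the last question boils down to timing a round trip and dividing the bytes moved by the elapsed time. A minimal host-side sketch of that arithmetic (hypothetical names, no CUDA calls, not the asker's program):

```python
# Hypothetical sketch of the bandwidth arithmetic behind a ping-pong
# test: one round trip moves the buffer twice, so effective bandwidth
# is 2 * buffer_bytes / elapsed_seconds. Names are illustrative.
def pingpong_bandwidth_gb_s(buffer_bytes, elapsed_seconds):
    return 2 * buffer_bytes / elapsed_seconds / 1e9

# 256 MiB each way, 0.02 s measured for the round trip
print(round(pingpong_bandwidth_gb_s(256 * 1024 * 1024, 0.02), 2))  # 26.84
```

Comparing the number this yields for gpu<->gpu against gpu<->cpu transfers is what reveals whether NVLink peer-to-peer paths are actually being used.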