Highest Voted 'gpu-cooperative-groups' Questions

4

votes

3 answers

Can I launch a cooperative kernel without passing an array of pointers?

The CUDA runtime API allows us to launch kernels using the variable-number-of-arguments triple-chevron syntax: my_kernel<<<...>>>(first_arg, second_arg, and_as_many, as_we, want_to, etc, etc); but as regards…

asked Jan 31 '18 at 16:53

einpoklum

3

votes

1 answer

Should thread_block type be passed by reference?

Question: When passing thread_group type objects to a device function, is there a preference for passing by reference vs. passing by value? Is one of them "correct"? What are the differences for each approach? When should each approach be…

c++ cuda gpu-cooperative-groups

asked Jun 14 '23 at 21:49

John Mansell

2

votes

1 answer

The Unresolved extern function 'cudaCGGetIntrinsicHandle' strikes back

I'm building the examples for my cuda-api-wrappers repo. A while ago, I encountered a problem: Whenever I tried compiling a .cu file containing code that uses the "cooperative groups" mechanism, I got an error message: ptxas fatal : Unresolved extern…

cuda linker-errors nvcc gpu-cooperative-groups

asked Dec 08 '19 at 19:20

einpoklum

2

votes

1 answer

Two consecutive kernels or whole-grid cooperative group synchronization?

Suppose I have two tasks to run on a GPU, the second of which relies on essentially all work by the first. Traditionally, I would essentially have to write these tasks as two separate kernels and schedule the second to run at some point after the…

performance cuda gpu-cooperative-groups

asked Jan 08 '19 at 00:14

einpoklum

2

votes

1 answer

cooperative_groups::this_grid() causes any CUDA API call to return 'unknown error'

Following the same steps as the CUDA samples to launch a kernel and sync across the grid using cooperative_groups::this_grid().sync() causes any CUDA API call to fail, while using cooperative_groups::this_thread_block().sync() works fine and gives…

cmake cuda gpu gpu-cooperative-groups

asked Nov 27 '18 at 03:57

ahmed

2

votes

1 answer

CUDA Cooperative Groups : Linking error

After reading about Cooperative Groups in CUDA 9, I've been trying to synchronize at the grid level. I'm using Visual Studio 2017, a GTX 1060 and CUDA 9.1. I altered my code as follows: __global__ void ExplicitKernel_American(/* ... */) { int i =…

cuda linker-errors nvcc gpu-cooperative-groups

asked Apr 05 '18 at 21:44

user3821901

1

vote

2 answers

Deadlocks with cuda cooperative groups

In the CUDA Programming Guide, in the section about Cooperative Groups, there is an example of grid-local synchronization: grid_group grid = this_grid(); grid.sync(); Unfortunately, I didn't find a precise definition of grid.sync() behavior. Is it…

cuda gpu-cooperative-groups

asked Nov 29 '19 at 19:07

user8044236

1

vote

1 answer

How can I use the cooperative groups feature of CUDA on Windows?

My GPU is a GeForce MX150 (Pascal architecture, CC 6.1), with CUDA 9.1 on Windows 10. Although my GPU is Pascal, cooperative groups don't work. I want to use them for inter-block synchronization. I found that TCC mode isn't active. I also found that …

parallel-processing cuda nvidia gpu gpu-cooperative-groups

asked Dec 14 '18 at 13:36

pedram64

0

votes

1 answer

Memory allocation is not permitted when running kernel with cudaLaunchCooperativeKernel and -rdc=true

An error "operation not permitted" is generated when running the following code. Is there anything I am missing? I'm running it with compute capabilities 7.5 and the command nvcc test.cu -rdc=true. It works without RDC. #include…

c++ cuda gpu-cooperative-groups

asked Apr 06 '23 at 08:51

Pierre T.

0

votes

1 answer

In CUDA, how can I get this warp's thread mask in conditionally executed code (in order to execute e.g., __shfl_sync or .shfl)?

I'm trying to update some older CUDA code (pre CUDA 9.0), and I'm having some difficulty updating usage of warp shuffles (e.g., __shfl). Basically the relevant part of the kernel might be something like this: int f = d[threadIdx.x]; int warpLeader =…

cuda gpu-warp gpu-cooperative-groups

asked Mar 22 '22 at 18:41

sg_man

0

votes

1 answer

How to run cuda cooperative template kernel

I am trying, unsuccessfully, to launch a template kernel as a cooperative kernel in CUDA C++. What am I doing wrong? Error: cannot determine which instance of function template "boolPrepareKernel" is intended. I try to invoke the kernel like…

c++ cuda gpu-cooperative-groups

asked Jan 26 '22 at 16:17

Jakub Mitura

0

votes

1 answer

error: class "cooperative_groups::__v1::thread_block" has no member "is_valid"

We are trying to use some of the cooperative groups features in CUDA to write a small application. We are using a Tesla V100 card with CUDA 11.0. But on using the is_valid() method of thread_block, the error below is observed: error: class…

cuda gpu-cooperative-groups

asked Jan 28 '21 at 14:23

Satyanvesh D

0

votes

1 answer

Why is cudaLaunchCooperativeKernel() returning not permitted?

So I am using a GTX 1050 with a compute capability of 6.1 and CUDA 11.0. I need to use grid synchronization in my program, so cudaLaunchCooperativeKernel() is needed. I have checked my device query, so the GPU does have support for cooperative groups.…

cuda dynamic-parallelism gpu-cooperative-groups

asked Dec 22 '20 at 09:23

abhishekpurandare1297

0

votes

0 answers

CUDA kernel stops working when using cooperative groups grid sync() function

I was trying to implement some CUDA kernels leveraging the cooperative groups feature when I found that the kernel wouldn't do anything at all. I boiled it down to a small example, which can be seen below. As soon as I call the sync() function…

c++ cuda nvcc gpu-cooperative-groups

asked Dec 08 '20 at 17:34

KoKlA

0

votes

0 answers

Why does nvcc refuse to link this simple cooperative-groups program?

Consider the following CUDA program, in a file named foo.cu: #include #include __global__ void my_kernel() { auto g = cooperative_groups::this_grid(); g.sync(); } int main(int, char **) { …

cuda linker linker-errors ptxas gpu-cooperative-groups

asked Dec 25 '19 at 18:38

einpoklum

Questions tagged [gpu-cooperative-groups]