Questions tagged [thrust]

Thrust is a template library of parallel algorithms for NVIDIA CUDA, with an interface resembling the C++ Standard Template Library (STL).

Thrust is a library of parallel algorithms with an interface resembling the C++ Standard Template Library (STL) that interoperates with technologies such as CUDA, OpenMP, and TBB. Thrust provides a flexible, high-level interface for parallel programming, designed to greatly simplify common data-parallel processing tasks on GPUs and multicore CPUs and thereby enhance developer productivity.

As of CUDA release 4.0, a Thrust snapshot is included in every release of the standard CUDA toolkit distribution. Under Debian and Ubuntu, Thrust may also be installed through apt-get:

sudo apt-get install libthrust-dev

The project homepage contains documentation and sample code demonstrating usage of the library. The latest source, bug reports, and discussions are always available on GitHub.

The rocThrust project (GitHub) by AMD is a port of Thrust to their HIP/ROCm platform.

959 questions
47
votes
4 answers

Thrust inside user-written kernels

I am a newbie to Thrust. I see that all Thrust presentations and examples only show host code. I would like to know if I can pass a device_vector to my own kernel? How? If yes, what are the operations permitted on it inside kernel/device code?
Ashwin Nanjappa
  • 76,204
  • 83
  • 211
  • 292
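
A note for readers landing here: a device_vector itself cannot be used inside a kernel, but its underlying storage can be passed as a raw pointer, and since Thrust 1.8 many algorithms can also be called from device code via the thrust::seq execution policy. A minimal sketch of the raw-pointer pattern (the kernel name is illustrative):

#include <thrust/device_vector.h>

// Illustrative kernel: device code receives a raw pointer,
// never the device_vector itself.
__global__ void square_kernel(int* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= data[i];
}

int main()
{
    int n = 1024;
    thrust::device_vector<int> dv(n, 2);

    // Expose the vector's storage as a raw device pointer.
    int* raw = thrust::raw_pointer_cast(dv.data());

    square_kernel<<<(n + 255) / 256, 256>>>(raw, n);
    cudaDeviceSynchronize();
    return 0;
}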
42
votes
1 answer

Differences between VexCL, Thrust, and Boost.Compute

With just a cursory understanding of these libraries, they look to be very similar. I know that VexCL and Boost.Compute use OpenCL as a backend (although the v1.0 release of VexCL also supports CUDA as a backend) and Thrust uses CUDA. Aside from the…
Sean Lynch
  • 2,852
  • 4
  • 32
  • 46
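
One concrete difference worth illustrating: Thrust code is written against a backend-agnostic interface, so the same source can be rebuilt for the CUDA, OpenMP, or TBB device system by defining THRUST_DEVICE_SYSTEM at compile time (e.g. -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_OMP). A minimal SAXPY sketch under that assumption:

#include <thrust/device_vector.h>
#include <thrust/transform.h>

// Simple SAXPY functor; the same code runs on whichever backend
// Thrust was configured with at compile time.
struct saxpy
{
    float a;
    saxpy(float a) : a(a) {}
    __host__ __device__ float operator()(float x, float y) const
    {
        return a * x + y;
    }
};

int main()
{
    thrust::device_vector<float> x(1 << 20, 1.0f);
    thrust::device_vector<float> y(1 << 20, 2.0f);

    // y = 2*x + y, executed by the selected device system
    thrust::transform(x.begin(), x.end(), y.begin(), y.begin(), saxpy(2.0f));
    return 0;
}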
25
votes
1 answer

How to cast thrust::device_vector to a raw pointer

I have a thrust device_vector. I want to cast it to a raw pointer so that I can pass it to a kernel. How can I do so? thrust::device_vector<int> dv(10); //CAST TO RAW kernel<<<...>>>(pass raw)
Programmer
  • 6,565
  • 25
  • 78
  • 125
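
The standard idiom is thrust::raw_pointer_cast. A minimal sketch, with a hypothetical kernel and an illustrative launch configuration:

#include <thrust/device_vector.h>

__global__ void my_kernel(int* p, int n) { /* ... use p[0..n) ... */ }

int main()
{
    thrust::device_vector<int> dv(10);

    // data() returns a thrust::device_ptr<int>; raw_pointer_cast unwraps it.
    int* raw = thrust::raw_pointer_cast(dv.data());
    // An equivalent older spelling: thrust::raw_pointer_cast(&dv[0]);

    my_kernel<<<1, 10>>>(raw, 10);
    cudaDeviceSynchronize();
    return 0;
}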
23
votes
1 answer

Efficiency of CUDA vector types (float2, float3, float4)

I'm trying to understand the integrate_functor in particles_kernel.cu from CUDA examples: struct integrate_functor { float deltaTime; //constructor for functor //... template <class T> __device__ void…
ilciavo
  • 3,069
  • 7
  • 26
  • 40
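
For context on why the packed types matter: a float4 access compiles to a single 16-byte memory transaction per thread instead of four 4-byte ones, which can improve effective bandwidth. A hedged sketch contrasting the two patterns (kernel names are illustrative):

// Scalar copy: each thread issues four 4-byte loads and four 4-byte stores.
__global__ void copy_scalar(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (4 * i + 3 < n) {
        out[4 * i + 0] = in[4 * i + 0];
        out[4 * i + 1] = in[4 * i + 1];
        out[4 * i + 2] = in[4 * i + 2];
        out[4 * i + 3] = in[4 * i + 3];
    }
}

// Vectorized copy: one 16-byte load and one 16-byte store per thread.
// Requires 16-byte alignment of in/out, which cudaMalloc guarantees.
__global__ void copy_float4(const float4* in, float4* out, int n4)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n4)
        out[i] = in[i];
}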
22
votes
3 answers

From thrust::device_vector to raw pointer and back?

I understand how to go from a vector to a raw pointer but I'm skipping a beat on how to go backwards. // our host vector thrust::host_vector<int> hVec; // pretend we put data in it here // get a device_vector thrust::device_vector<int> dVec =…
madmaze
  • 878
  • 3
  • 11
  • 23
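
Both directions are supported: thrust::raw_pointer_cast goes from vector to pointer, and thrust::device_pointer_cast wraps a raw device pointer back into a thrust::device_ptr usable with Thrust algorithms. A minimal sketch:

#include <thrust/device_vector.h>
#include <thrust/device_ptr.h>
#include <thrust/reduce.h>

int main()
{
    thrust::device_vector<int> dVec(100, 1);

    // vector -> raw pointer
    int* raw = thrust::raw_pointer_cast(dVec.data());

    // raw pointer -> thrust::device_ptr, usable with Thrust algorithms again
    thrust::device_ptr<int> dp = thrust::device_pointer_cast(raw);
    int sum = thrust::reduce(dp, dp + 100); // sums the same storage

    return sum == 100 ? 0 : 1;
}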
17
votes
2 answers

Finding the maximum element value AND its position using CUDA Thrust

How do I get not only the value but also the position of the maximum (minimum) element (res.val and res.pos)? thrust::host_vector<int> h_vec(100); thrust::generate(h_vec.begin(), h_vec.end(), rand); thrust::device_vector<int> d_vec = h_vec; T…
rych
  • 682
  • 3
  • 10
  • 22
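
thrust::max_element (and min_element) return an iterator, so the position and the value both fall out directly. A minimal sketch:

#include <thrust/device_vector.h>
#include <thrust/extrema.h>
#include <thrust/sequence.h>
#include <cstdio>

int main()
{
    thrust::device_vector<int> d_vec(100);
    thrust::sequence(d_vec.begin(), d_vec.end()); // 0, 1, ..., 99

    // Iterator pointing at the maximum element.
    thrust::device_vector<int>::iterator it =
        thrust::max_element(d_vec.begin(), d_vec.end());

    int pos = it - d_vec.begin(); // index of the maximum
    int val = *it;                // its value (one device-to-host read)

    std::printf("max %d at position %d\n", val, pos);
    return 0;
}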
14
votes
1 answer

Max number of threads which can be initiated in a single CUDA kernel

I am confused about the maximum number of threads which can be launched in a Fermi GPU. My GTX 570 device query says the following. Maximum number of threads per block: 1024 Maximum sizes of each dimension of a block: 1024 x 1024 x…
smilingbuddha
  • 14,334
  • 33
  • 112
  • 189
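
The per-launch total is the product of the threads-per-block limit and the grid-dimension limits, all of which the runtime reports. A minimal sketch using the CUDA runtime API:

#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);

    // Threads per launch = threads/block * blocks/grid; on Fermi the grid
    // x-dimension alone allows 65535 blocks of up to 1024 threads each.
    std::printf("max threads per block: %d\n", prop.maxThreadsPerBlock);
    std::printf("max grid size: %d x %d x %d\n",
                prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);
    return 0;
}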
13
votes
1 answer

Thrust: How to create device_vector from host array?

I get some data from a library on the host as a pointer to an array. How do I create a device_vector that holds this data on the device? int* data; int num; get_data_from_library( &data, &num ); thrust::device_vector< int > iVec; // How to…
Ashwin Nanjappa
  • 76,204
  • 83
  • 211
  • 292
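
device_vector has a range constructor that accepts plain host pointers, so no intermediate host_vector is required. A sketch, with get_data_from_library standing in for the asker's library call:

#include <thrust/device_vector.h>

// Placeholder for the asker's library function.
void get_data_from_library(int** data, int* num);

void upload()
{
    int* data;
    int num;
    get_data_from_library(&data, &num);

    // The range constructor copies [data, data + num) host -> device.
    thrust::device_vector<int> iVec(data, data + num);
}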
11
votes
2 answers

large integer addition with CUDA

I've been developing a cryptographic algorithm on the GPU and am currently stuck on an algorithm to perform large-integer addition. Large integers are represented in the usual way as a bunch of 32-bit words. For example, we can use one thread to add…
user1545642
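
One parallel formulation fits Thrust well: compute per-word sums together with carry generate/propagate flags, resolve all carries at once with the (associative) carry-lookahead operator under a scan, then add each word's carry-in. A sketch under stated assumptions (little-endian 32-bit limbs, equal lengths, final carry-out ignored):

#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/scan.h>
#include <thrust/functional.h>

// Pack (generate, propagate) carry flags into bits 0 and 1 of an unsigned.
struct make_gp
{
    __host__ __device__ unsigned operator()(unsigned a, unsigned b) const
    {
        unsigned s = a + b;
        unsigned g = (s < a) ? 1u : 0u;            // word overflows: generates a carry
        unsigned p = (s == 0xFFFFFFFFu) ? 2u : 0u; // word passes an incoming carry along
        return g | p;
    }
};

// Carry-lookahead combine; associative, so it is legal under a scan.
// 'lo' summarizes the less significant words.
struct combine_gp
{
    __host__ __device__ unsigned operator()(unsigned lo, unsigned hi) const
    {
        unsigned g = (hi & 1u) | (((hi >> 1) & lo) & 1u); // hi.g | (hi.p & lo.g)
        unsigned p = (hi >> 1) & (lo >> 1) & 1u;          // hi.p & lo.p
        return g | (p << 1);
    }
};

// Per-word sum plus the resolved carry-in bit.
struct add_carry
{
    __host__ __device__ unsigned operator()(unsigned s, unsigned gp) const
    {
        return s + (gp & 1u);
    }
};

void big_add(const thrust::device_vector<unsigned>& a,
             const thrust::device_vector<unsigned>& b,
             thrust::device_vector<unsigned>& out)
{
    size_t n = a.size();
    thrust::device_vector<unsigned> s(n), gp(n);

    thrust::transform(a.begin(), a.end(), b.begin(), s.begin(),
                      thrust::plus<unsigned>());
    thrust::transform(a.begin(), a.end(), b.begin(), gp.begin(), make_gp());

    // Exclusive scan turns the flags into each word's carry-in;
    // 2u encodes the identity element (g = 0, p = 1).
    thrust::exclusive_scan(gp.begin(), gp.end(), gp.begin(), 2u, combine_gp());

    thrust::transform(s.begin(), s.end(), gp.begin(), out.begin(), add_carry());
}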
10
votes
3 answers

Resolving Thrust/CUDA warnings "Cannot tell what pointer points to..."

I'm trying to build a trivial application using Thrust/CUDA 4.0 and get lots of warnings: "warning : Cannot tell what pointer points to, assuming global memory space". Has anyone else seen this, and how do I either disable them or fix my…
Ade Miller
  • 13,575
  • 1
  • 42
  • 75
10
votes
1 answer

thrust::device_vector in constant memory

I have a float array that needs to be referenced many times on the device, so I believe the best place to store it is in __constant__ memory (using this reference). The array (or vector) will need to be written once at run-time when initializing,…
user2462730
  • 171
  • 1
  • 10
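
A device_vector cannot be placed in __constant__ memory, because that storage must be a statically sized symbol known at compile time; the usual workaround is to copy the vector's contents into a __constant__ array with cudaMemcpyToSymbol. A sketch, assuming the data fits the 64 KB constant-memory limit:

#include <thrust/device_vector.h>

static const int N = 1024;
__constant__ float c_table[N]; // statically sized, compile-time bound

void upload_to_constant(const thrust::device_vector<float>& v)
{
    // Device-to-device copy from the vector's storage into the symbol.
    cudaMemcpyToSymbol(c_table,
                       thrust::raw_pointer_cast(v.data()),
                       N * sizeof(float),
                       0,
                       cudaMemcpyDeviceToDevice);
}

__global__ void use_table(float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && i < N)
        out[i] = 2.0f * c_table[i]; // uniform reads hit the constant cache
}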
10
votes
1 answer

Performing Fourier Transform with Thrust

Thrust is an amazing wrapper for getting started with CUDA programming. I wonder whether there is anything that encapsulates NVIDIA CUFFT with Thrust, or do we need to implement it ourselves?
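
Thrust itself has no FFT, but cuFFT operates directly on device memory, so the two compose through raw_pointer_cast. A sketch of an in-place 1-D complex-to-complex forward transform (link with -lcufft):

#include <thrust/device_vector.h>
#include <cufft.h>

void fft_in_place(thrust::device_vector<cufftComplex>& signal)
{
    cufftHandle plan;
    cufftPlan1d(&plan, (int)signal.size(), CUFFT_C2C, 1 /* batch */);

    // cuFFT consumes the device_vector's storage directly.
    cufftComplex* d = thrust::raw_pointer_cast(signal.data());
    cufftExecC2C(plan, d, d, CUFFT_FORWARD);

    cufftDestroy(plan);
}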
9
votes
1 answer

How to avoid default construction of elements in thrust::device_vector?

It seems when creating a new Thrust vector all elements are 0 by default; I just want to confirm that this will always be the case. If so, is there also a way to bypass the constructor responsible for this behavior for additional speed (since for…
mchen
  • 9,808
  • 17
  • 72
  • 125
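
Yes, device_vector value-initializes (zero-fills) its elements. The customary workaround in Thrust versions of this era is a custom allocator whose construct() is a no-op; a sketch derived from thrust::device_malloc_allocator:

#include <thrust/device_vector.h>
#include <thrust/device_malloc_allocator.h>

// Allocator that skips element construction, leaving memory uninitialized.
template <typename T>
struct uninitialized_allocator : thrust::device_malloc_allocator<T>
{
    __host__ __device__ void construct(T*)
    {
        // no-op: do not zero-fill
    }
};

int main()
{
    // Allocates storage for 1M floats without the usual fill kernel.
    thrust::device_vector<float, uninitialized_allocator<float> > v(1 << 20);
    return 0;
}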
9
votes
3 answers

How to remove zero values from an array in parallel

How can I efficiently remove zero values from an array in parallel using CUDA? The information about the number of zero values is available in advance, which should simplify this task. It is important that the numbers remain ordered as in the…
diver_182
  • 261
  • 6
  • 13
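
This is stream compaction, and Thrust's remove and copy_if are stable, so the surviving elements keep their order. A minimal sketch of both the in-place and out-of-place spellings:

#include <thrust/device_vector.h>
#include <thrust/remove.h>
#include <thrust/copy.h>

struct is_nonzero
{
    __host__ __device__ bool operator()(int x) const { return x != 0; }
};

int main()
{
    int h[8] = { 3, 0, 1, 0, 0, 4, 1, 5 };
    thrust::device_vector<int> v(h, h + 8);

    // In place: remove() compacts, erase() trims the tail.
    v.erase(thrust::remove(v.begin(), v.end(), 0), v.end());

    // Out of place alternative, on a fresh copy of the input:
    thrust::device_vector<int> in(h, h + 8), out(8);
    thrust::device_vector<int>::iterator end =
        thrust::copy_if(in.begin(), in.end(), out.begin(), is_nonzero());
    out.resize(end - out.begin());

    return 0;
}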
8
votes
1 answer

Is there a better and faster way to copy from CPU memory to GPU using Thrust?

Recently I have been using Thrust a lot. I have noticed that in order to use Thrust, one must always copy the data from CPU memory to GPU memory. Let's see the following example: int foo(int *foo) { host_vector<int> m(foo, foo+…
igal k
  • 1,883
  • 2
  • 28
  • 57
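
The intermediate host_vector is unnecessary: device_vector's range constructor and thrust::copy accept raw host pointers directly, and for repeated transfers pinned (page-locked) host memory lets the copy run at full DMA bandwidth. A sketch:

#include <thrust/device_vector.h>
#include <thrust/copy.h>
#include <cuda_runtime.h>

void upload(const int* src, int n)
{
    // One host-to-device copy, straight from the raw host pointer.
    thrust::device_vector<int> d(src, src + n);

    // Or reuse an existing device_vector of the right size:
    thrust::device_vector<int> d2(n);
    thrust::copy(src, src + n, d2.begin());
}

// For frequent transfers, a pinned (page-locked) host buffer lets the
// copy proceed by DMA at full bandwidth (illustrative helper):
int* alloc_pinned(int n)
{
    int* p = 0;
    cudaMallocHost((void**)&p, n * sizeof(int));
    return p;
}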