It seems when creating a new Thrust vector all elements are 0 by default - I just want to confirm that this will always be the case.
If so, is there also a way to bypass the constructor responsible for this behavior for additional speed (since for some vectors I don't need them to have an initial value, e.g. if their raw pointers are being passed to CUBLAS as an output)?
Jared Hoberock

mchen
1 Answer
`thrust::device_vector` constructs the elements it contains using its supplied allocator, just like `std::vector`. It's possible to control what the allocator does when the vector asks it to construct an element.
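To make the `std::vector` analogy concrete, here is a minimal host-side sketch of the same idea. It is not part of Thrust: `host_uninitialized_allocator`, `host_uninitialized_vector`, and `make_squares` are hypothetical names chosen for illustration, and the sketch assumes a C++11 or later compiler. The allocator's zero-argument `construct` does nothing, so `std::vector(n)` allocates storage without zero-filling it.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical host-side analogue of the technique (not a Thrust type):
// a minimal allocator whose zero-argument construct is a no-op.
template <typename T>
struct host_uninitialized_allocator
{
    typedef T value_type;

    host_uninitialized_allocator() = default;

    template <typename U>
    host_uninitialized_allocator(const host_uninitialized_allocator<U>&) {}

    T* allocate(std::size_t n)
    {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    void deallocate(T* p, std::size_t) { ::operator delete(p); }

    // Called for vector(n) and resize(n): leave the memory untouched.
    template <typename U>
    void construct(U*) {}

    // Calls that carry arguments (e.g. vector(n, value)) still construct normally.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args)
    {
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const host_uninitialized_allocator<T>&,
                const host_uninitialized_allocator<U>&) { return true; }

template <typename T, typename U>
bool operator!=(const host_uninitialized_allocator<T>&,
                const host_uninitialized_allocator<U>&) { return false; }

typedef std::vector<float, host_uninitialized_allocator<float> > host_uninitialized_vector;

// Usage: allocate n floats without zero-filling them, then overwrite every
// element before any read (reading an element before writing it is undefined).
inline host_uninitialized_vector make_squares(std::size_t n)
{
    host_uninitialized_vector out(n); // invokes the no-op construct per element
    assert(out.size() == n);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<float>(i * i);
    return out;
}
```

The trade-off is the same as on the device: the contents are indeterminate until written, so this only makes sense for buffers that are fully overwritten before they are read, such as output arrays.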
Use a custom allocator to avoid default-initialization of vector elements:
    #include <thrust/device_malloc_allocator.h>
    #include <thrust/device_vector.h>

    // uninitialized_allocator is an allocator which
    // derives from device_malloc_allocator and which has a
    // no-op construct member function
    template<typename T>
      struct uninitialized_allocator
        : thrust::device_malloc_allocator<T>
    {
      // note that construct is annotated as
      // a __host__ __device__ function
      __host__ __device__
      void construct(T *p)
      {
        // no-op
      }
    };

    // to make a device_vector which does not initialize its elements,
    // use uninitialized_allocator as the 2nd template parameter
    typedef thrust::device_vector<float, uninitialized_allocator<float> > uninitialized_vector;
You will still incur the cost of a kernel launch to invoke `uninitialized_allocator::construct`, but that kernel will be a no-op which will retire quickly. What you're really interested in is avoiding the memory bandwidth required to fill the array, which this solution does.
There's a complete example code here.
Note that this technique requires Thrust 1.7 or later.

Jared Hoberock
- Very nice. Despite having written debug allocator overloads for stl before, I forgot the final construct call was here. Should have kept digging. +1 :) – leander May 06 '13 at 12:01
- Actually -- I'm confused. Am I wrong in that the `resize` in your example chains to `insert`, which in turn chains to `fill_insert`, which ends up in `uninitialized_fill_n` again? So you're still getting copies, despite possibly ignoring the `construct` when setting up the new `storage_type` area there? ...obviously I need to go step through this in a debugger, but I'm not seeing how the eventual `uninitialized_fill_n` from the default/value initialized `x` default arg can be avoided. – leander May 06 '13 at 23:05
- You may need to step through in a debugger with the most recent Thrust. It's a complicated dispatch. – Jared Hoberock May 06 '13 at 23:14
- Got it, sorry. For some reason I was simply missing the one-arg version of resize() in vector_base.inl. Sorry for the bother! For anyone watching along, the stack is: (1) detail::contiguous_storage::contiguous_storage(unsigned int n, const uninitialized_allocator& alloc) (2) detail::vector_base::append(unsigned int n) (3) detail::vector_base::resize(unsigned int new_size) – leander May 06 '13 at 23:42