What is the proper (or optimal) way to use constant data in functors passed to Thrust algorithms such as thrust::transform? The naive approach I used was simply to allocate the required arrays inside the functor's operator() method, like this:

struct my_functor {

    __host__ __device__
    float operator()(thrust::tuple<float, float> args) {

        float A[2][10] = {
            { 4.0, 1.0, 8.0, 6.0, 3.0, 2.0, 5.0, 8.0, 6.0, 7.0 },
            { 4.0, 1.0, 8.0, 6.0, 7.0, 9.0, 5.0, 1.0, 2.0, 3.6 }};

        float x1 = thrust::get<0>(args);
        float x2 = thrust::get<1>(args);

        float result = 0.0;
        for (int i = 0; i < 10; ++i)
            result += x1 * A[0][i] + x2 * A[1][i];

        return result;
    }
};

But this does not seem a very elegant or efficient approach. I now have to develop a relatively complicated functor with some matrices (constant, as in the example above) and additional methods called from the functor's operator() method. What is the optimal way to solve such a problem? Thanks.

lexxa2000
  • 1
    "optimal" in which way? – m.s. Jun 10 '15 at 07:27
  • 2
    Do you have a less trivial example of what you are trying to do? The code above could be hugely simplified, removing the loop and replacing the 20 constants in the array with 2 constants, reducing to `result = x1 * AA0 + x2 * AA1`... – talonmies Jun 10 '15 at 07:28
  • I've learnt several ways, like: (1) use `device_vector`s and permutation iterators; (2) allocate memory for the arrays in device memory in advance (but I don't know exactly how I should use this technique in my code...) and use `device_ptr` and other tools inside a functor; (3) allocate arrays inside the `operator()`... Maybe there exist some other ways. For now I chose the 3rd one, but it's not clear to me how to share these arrays among several methods inside the functor... Probably I should make these arrays fields of the struct, but the code won't compile – lexxa2000 Jun 10 '15 at 07:37
  • @talonmies yes, it could be simplified in such a way, you're right, of course ... but what if I can't hardcode these constants in advance and need to fill these arrays from outside the functor? How can I implement such arrays so that they can be shared among several private methods inside a functor? Thanks. – lexxa2000 Jun 10 '15 at 07:45
  • @lexxa2000 show the code which does not compile for you – m.s. Jun 10 '15 at 07:58
  • 1
    You can also pass *device pointers* (for functors operating on the device) as initializing parameters to functors, similar to what is shown in the answer given by @talonmies. This would allow you to fill/modify the arrays from outside the functor. And this allows for sharing, of course, as well. An example is in the answer [here](http://stackoverflow.com/questions/25217333/splicing-two-different-length-vectors-based-on-their-respective-index-vectors-co). – Robert Crovella Jun 10 '15 at 14:46
  • thanks @RobertCrovella – lexxa2000 Jun 11 '15 at 10:25

1 Answer

From your last comment, it is clear that what you are really asking about here is functor parameter initialisation. CUDA uses the C++ object model, so structures have class semantics and behaviour. So your example functor

struct my_functor {
    __host__ __device__
    float operator()(thrust::tuple<float, float> args) const {
        float A[2] = {50., 46.6};  // row sums of the 2x10 matrix in the question

        float x1 = thrust::get<0>(args);
        float x2 = thrust::get<1>(args);

        return x1 * A[0]+ x2 * A[1];
    }
};

can be rewritten with a constructor that has an empty body and an initialiser list, turning the hardcoded constants within the functor into runtime-assignable values:

struct my_functor {
    float A0, A1;

    __host__ __device__
    my_functor(float _a0, float _a1) : A0(_a0), A1(_a1) { }

    __host__ __device__
    float operator()(thrust::tuple<float, float> args) const {
        float x1 = thrust::get<0>(args);
        float x2 = thrust::get<1>(args);

        return x1 * A0 + x2 * A1;
    }
};

You can instantiate as many versions of the functor as you need, each with different constant values, and pass them to whichever Thrust algorithms you are using.
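The same idea can be extended to the small matrix from the question. What follows is only a minimal sketch, not a definitive implementation: `matrix_functor`, its helper `term`, and the wrapper `apply_matrix_functor` are hypothetical names introduced for illustration. It assumes the matrix is small enough to be stored by value inside the functor, so that it is copied to the device together with the functor object.

```cuda
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/transform.h>
#include <thrust/tuple.h>

// Hypothetical functor holding a 2x10 coefficient matrix as a data member.
struct matrix_functor {
    float A[2][10];  // stored by value, so it travels with the functor

    // Fill the member matrix from caller-supplied values at construction time.
    __host__ __device__
    explicit matrix_functor(const float (&src)[2][10]) {
        for (int r = 0; r < 2; ++r)
            for (int c = 0; c < 10; ++c)
                A[r][c] = src[r][c];
    }

    // Helper method: every member function sees the same matrix A.
    __host__ __device__
    float term(int i, float x1, float x2) const {
        return x1 * A[0][i] + x2 * A[1][i];
    }

    __host__ __device__
    float operator()(thrust::tuple<float, float> args) const {
        float x1 = thrust::get<0>(args);
        float x2 = thrust::get<1>(args);

        float result = 0.0f;
        for (int i = 0; i < 10; ++i)
            result += term(i, x1, x2);
        return result;
    }
};

// Hypothetical wrapper: copies the inputs to the device, applies the functor
// to each (x1, x2) pair through a zip iterator, and returns the results.
thrust::host_vector<float> apply_matrix_functor(
        const thrust::host_vector<float>& x1,
        const thrust::host_vector<float>& x2,
        const float (&coeffs)[2][10]) {
    thrust::device_vector<float> d_x1 = x1;
    thrust::device_vector<float> d_x2 = x2;
    thrust::device_vector<float> d_out(x1.size());

    thrust::transform(
        thrust::make_zip_iterator(thrust::make_tuple(d_x1.begin(), d_x2.begin())),
        thrust::make_zip_iterator(thrust::make_tuple(d_x1.end(), d_x2.end())),
        d_out.begin(),
        matrix_functor(coeffs));

    thrust::host_vector<float> h_out = d_out;
    return h_out;
}
```

Because the functor object is passed by value, this layout is probably only sensible for small constant tables. For larger arrays, or arrays that must be modified from outside, passing a device pointer to the functor's constructor, as suggested in the comments on the question, is likely the better fit.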

talonmies
  • this method works, I can't give more complicated code yet, maybe when I have it the problem will become more clear. Thanks. – lexxa2000 Jun 10 '15 at 10:34