0

What I'm trying to do is get an average of values by key via thrust::reduce_by_key. I first sort_by_key and that works just fine to group by consecutive keys for reduce_by_key. I used this to help get me this far. However, I am getting lots of errors that I cannot understand (this is also my first time using reduce_by_key) and I can't think of a better way to do this without using lots of temporary allocations to (1) get the sum of the values by key then the count by key and (2) divide the two for the average.

input keys:   1,   1,   1,  2,   3,   5,  5,  2
input values: 120, 477, 42, 106, 143, 53, 83, 24

expected output values: 213, 65, 143, 68

I have the following custom functor:

struct GetAverage
{
    template<typename Tuple>
    __host__ __device__
    int operator()(const Tuple& t)
    {
        //SumByKey / CountByKey
        return thrust::get<0>(t) / thrust::get<1>(t);
    }
};

The functor is called from the below code, located in main()

thrust::device_vector<unsigned int> tempKey(8);
thrust::device_vector<unsigned int> tempValue(8);

tempKey[0] = 1;
tempKey[1] = 1;
tempKey[2] = 1;
tempKey[3] = 2;
tempKey[4] = 3;
tempKey[5] = 5;
tempKey[6] = 5;
tempKey[7] = 2;

tempValue[0] = 120;
tempValue[1] = 477;
tempValue[2] = 42;
tempValue[3] = 106;
tempValue[4] = 143;
tempValue[5] = 53;
tempValue[6] = 83;
tempValue[7] = 24;

thrust::sort_by_key(tempKey.begin(), tempKey.end(), tempValue.begin());

thrust::equal_to<int> binary_pred;
thrust::reduce_by_key(
    tempKey.begin(),
    tempKey.end(),
    thrust::make_zip_iterator(
        thrust::make_tuple(
            tempValue.begin(),
            thrust::make_constant_iterator(1)
        )
    ), //values_first; Should go into GetAverage() custom functor as a zipped tuple <tempValue, 1>
    tempKey.begin(), //keys_output; Should be returning the unique keys
    tempValue.begin(), //values_output; Should be returning the average by key 
    binary_pred,
    GetAverage()
);

Example errors:
-no instance of function template "GetAverage::operator()" matches the argument list
-no operator "=" matches these operands
-no suitable conversion function from "InputValueType" to "TemporaryType" exists
-no suitable conversion function from "thrust::detail::tuple_of_iterator_references<thrust::device_reference<int>, int, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type>" to "TemporaryType" exists

Does anyone have any ideas on how to fix this? Or links? I read the documentation on everything in use here and was as careful in trying to understand it as possible, but with no resolution. Thanks!

UPDATE
See Eric's answer. In conjunction with what he said, this is the new source code. Created an op to handle plus for Tuples. The only thing this code doesn't do is after the reduce_by_key call, a thrust::transform should be used on the results to get the average by dividing the sum by the count.

// --- Defining key tuple type
typedef thrust::tuple<int, int> Tuple;

/* PLUS OPERATOR BETWEEN TUPLES */
struct TuplePlus
{
    __host__ __device__
    Tuple operator ()(const Tuple& lhs, const Tuple& rhs)
    {
        return thrust::make_tuple(
            thrust::get<0>(lhs) + thrust::get<0>(rhs),
            thrust::get<1>(lhs) + thrust::get<1>(rhs)
        );
    }
};

Inside main() I have the following now.

thrust::equal_to<int> binary_pred;
thrust::reduce_by_key(
    tempKey.begin(),
    tempKey.end(),
    thrust::make_zip_iterator(
        thrust::make_tuple(
            tempValue.begin(),
            thrust::make_constant_iterator(1)
        )
    ), //values_first; goes in as a zipped up tuple <value, 1>
    tempKey.begin(), //keys_output
    thrust::make_zip_iterator(
        thrust::make_tuple(
            tempValue.begin(),
            tempCount.begin()
        )
    ), //values_output; ZipIterator<Sum, Count> by key
    binary_pred,
    TuplePlus()
);
Community
  • 1
  • 1
aiwyn
  • 268
  • 2
  • 9

1 Answers1

1

There are two issues.

The result of a tuple sequence reduction should be a tuple but not int. According to the document

https://thrust.github.io/doc/group__reductions.html#ga633d78d4cb2650624ec354c9abd0c97f

The last parameter binary_op should be of the type

BinaryFunction is a model of Binary Function and BinaryFunction's result_type is convertible to OutputIterator2's value_type.

It means your reduction operation should be something like

struct GetSum
{
  template<typename Tuple>
  __host__ __device__
  Tuple operator()(const Tuple& a, construction Tuple& b)
  {
    ...
  }
}

On the other hand, at the reduction stage, you can only calculate the sum but not the average efficiently. It means your values_output should also be a zip iterator with the same type as values_first.

OutputIterator2 is a model of Output Iterator and InputIterator2's value_type is convertible to OutputIterator2's value_type.

So you need two result arrays, one for the sum by key and one for the count by key. They should be zipped together and used as values_output .

Then you need another thrust::transform to calculate the final result - average by key.


You could also try the approach proposed by @RobertCrovella, which uses a single thrust::reduce_by_key to calculate the average.

Output from reduce_by_key() as a function of two reduced vectors

Community
  • 1
  • 1
kangshiyin
  • 9,681
  • 1
  • 17
  • 29
  • Ah, yes this makes sense now. Seeing as how I'll be performing the average by key in a separate `thrust::transform`, I don't need a custom BinaryFunction anymore, correct? I should be able to just rely on the default plus op for each item in the tuple to be summed. Thanks again. – aiwyn Jun 17 '16 at 23:34
  • 1
    No. I think there's no plus op for tuple. – kangshiyin Jun 18 '16 at 01:48
  • You're correct. I had to create my own. Updating to my question in case it helps anyone else. – aiwyn Jun 18 '16 at 16:21
  • You can calculate an average via `thrust::reduce` using a method similar to what I have outlined [here](http://stackoverflow.com/questions/37149344/output-from-reduce-by-key-as-a-function-of-two-reduced-vectors/37150407#37150407) – Robert Crovella Jun 21 '16 at 23:25
  • @RobertCrovella It is a nice approach. But the over all speed may be slower, as it introduces a lot of unnecessary division operation. – kangshiyin Jun 22 '16 at 06:30
  • Your approach introduces additional thrust calls, i.e. kernel calls and associated overhead, not to mention additional read/write overhead, which a single thrust call can avoid via fusion of operations. I don't think you can judge which will be faster unless you benchmark for a specific case. Anyway, my response was largely to this statement in your answer: "On the other hand, averaging can not be a reduction operation", which I would argue is either incorrect, or misleading. My statement: "You can calculate an average via thrust::reduce" I would argue is correct. In thrust, it *is* possible. – Robert Crovella Jun 22 '16 at 13:30
  • @RobertCrovella I changed my answer a little bit. You are right that thrust::reduce can calculate average, but by definition, averaging is not a reduction operation. – kangshiyin Jun 22 '16 at 14:30
  • Averaging *is* a reduction operation if I know the size of the array apriori. One possible definition of a reduction operation is any operation that produces a single scalar quantity from an array of data. By that definition, it *is* a reduction. Furthermore I can calculate the average as a reduction, by dividing each element by the array size before I sum them together. And for a memory bound problem, the overhead of additional divide operations may not matter. You would really have to benchmark or do more careful analysis to claim one is better than the other. – Robert Crovella Jun 22 '16 at 14:34
  • @RobertCrovella This case with reduce_by_key happens to be the situation that we don't know the size of each part. I agree that benchmark is required. I'm not sure which way is faster. That's why I said "may be slower". – kangshiyin Jun 22 '16 at 14:48
  • Yes, my latest comment is now with respect to your edit " averaging by definition is not a reduction operation." – Robert Crovella Jun 22 '16 at 14:50
  • @RobertCrovella ok, I usually follow a stricter definition. – kangshiyin Jun 22 '16 at 15:26