I have two group of arrays
a1 a2 a3 a4 a5 a6 a7 a8 <= name it as key1
b1 b2 b3 b4 b5 b6 b7 b8 <= val1
c1 c2 c3 c4 c5 c6 c7 c8
and
d1 d2 d3 d4 d5 d6 d7 d8 <= key2
e1 e2 e3 e4 e5 e6 e7 e8 <= val2
f1 f2 f3 f4 f5 f6 f7 f8
The arrays a1,...,an
and d1,...,dn
are sorted and might be repeated. i.e. their values might be something like 1 1 2 3 4 6 7 7 7 ...
I want to check if for each Tuple di,ei
check if it is equal to any of ai,bi
. If it is (di==ai,bi==ei
) then I have to combine fi
and ci
using some function e.g. add and store in fi.
Firstly, is it possible to do this using zip iterators and transformation in thurst library to solve this efficiently?
Secondly, the simplest method that I can imagine is to count occurance of number of each keys (ai)
do prefix sum and use both to get start and end index of each keys and then for each di use above counting to iterate through those indices and check if ei==di
. and perform the transformation.
i.e. If I have
1 1 2 3 5 6 7
2 3 4 5 2 4 6
2 4 5 6 7 8 5
as first array, I count the occurance of 1,2,3,4,5,6,7,...:
2 1 1 0 1 1 1 <=name it as count
and then do prefix sum to get:
2 3 4 4 5 6 7 <= name it as cumsum
and use this to do:
for each element di,
for i in (cumsum[di] -count[di]) to cumsum[di]:
if ei==val1[i] then performAddition;
What I fear is that since not all threads are equal, this will lead to warp divergence, and I may not have efficient performance.