0

I have 100000 arrays that are stored in a big 1D array. The size of each array is L =1000 and are ordered in descending order order.

Each array is divided in m=5 segments.(S1,S2,S3,S4,S5) For example :

A1=[S1,S2,S3,S4,S5]
A2=[S1,S2,S3,S4,S5]
A3=[S1,S2,S3,S4,S5]
…
A100000=[S1,S2,S3,S4,S5]

Here an example of the big container array is as follows: enter image description here

My question is :

For each window of w=10 arrays (for example) I want to reorganize the 10 arrays as follows: Put the segments S1 of each of these 10 arrays first, after that put S2 segments , S3 segments …. Hereunder an example with w =6 : enter image description here

As information L , m and w are parameters that may take on different values.

Is there a fast way to do it with Thrust ?

Robert Crovella
  • 143,785
  • 11
  • 213
  • 257
  • the methodology will be very similar to what is described [here](https://stackoverflow.com/questions/55885057/how-to-reduce-nonconsecutive-segments-of-numbers-in-array-with-thrust/55894275#55894275). You'll need to construct a map vector which will map input to output. Then pass that map vector, via a `thrust::permutation_iterator`, to `thrust::copy`. Note that sometimes good advice is to **not** copy data. If you are only using this "transformed view" of the data once, then just pass that permutation iterator to whatever thrust algorithm you are using once. – Robert Crovella Jun 29 '19 at 17:18
  • your data is laid out like a 2D array in linear memory. you don't need to do anything other than use a simple indexing scheme to access it as you describe. – talonmies Jun 29 '19 at 18:07
  • Thank you very much for you comment. I have see the example cited in your comment. it is a little bit difficult for me to adapt it for my case (since I'm not familiar with thrust and iterators). it is possible to make an example and put it as an answer? Thank you in advance dear @RobertCrovella – sara idrissi Jun 29 '19 at 19:49

1 Answers1

1

the methodology will be very similar to what is described here.

  1. construct a map vector (i.e. sequence produced by iterator) which will map input to output. We will use a thrust::counting_iterator to provide an ordinal sequence 0, 1, 2,..., and then construct the map iterator using a thrust::transform_iterator. The challenge is creating the correct arithmetic for the operation passed to the transform iterator. The code below is commented extensively to explain that.
  2. pass that map vector, via a thrust::permutation_iterator, to thrust::copy_n. Note that sometimes good advice is to not copy data. If you are only using this "transformed view" of the data once, then just pass that permutation iterator to whatever thrust algorithm you are using once, rather than actually copying the data. If you need to use the "transformed view" of the data many times, it may be more efficient to copy it once.

Here is a worked example:

$ cat t5.cu
#include <thrust/sequence.h>
#include <thrust/iterator/permutation_iterator.h>
#include <thrust/iterator/transform_iterator.h>
#include <thrust/iterator/discard_iterator.h>
#include <thrust/device_vector.h>
#include <thrust/copy.h>
#include <iostream>

typedef int dtype;  // data type
typedef int itype;  // indexing type - change to size_t if L*m*s > int
const itype s = 3;  // segment_width
const itype m = 2;  // number of segments per sub-array
const itype L = 4;  // number of sub-arrays per array
const itype w = 2;  // width of group (# of sub-arrays) for segment reordering
// S1 S1 S2 S2 S1 S1 S2 S2
//  0  1  2  6  7  8  3  4  5  9 10 11 12 13 14 18 19 20 15 16 17 21 22 23

using namespace thrust::placeholders;

int main(){
// we require data that consists of L*m*s elements
// we also require L to be whole-number divisible by w
  thrust::device_vector<dtype> d_data(L*m*s);
  thrust::device_vector<dtype> d_result(L*m*s);
  thrust::sequence(d_data.begin(), d_data.end());
// we will build up the necessary map iterator progressively
// we will start with an target_index sequence i =              0, 1, 2, 3, 4, 5, ... which defines the location in the target vector
// seg_idx    (position within a segment) = i%s                 0, 1, 2, 0, 1, 2, ...
// which_seg  (which segment within sub-array) = (i/(w*s))%m    0, 0, 0, 0, 0, 0, 1, ...
// which_sub  (which sub-array in group) = (i/s)%w              0, 0, 0, 1, 1, 1, 0, ...
// which_grp  (which group in array) = i/(w*s*m)                0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1 ...
// map index = which_grp*group_width + which_sub*subarray_width + which_seg*segment_width + seg_idx
// map index = (i/(w*s*m))*(w*s*m) + ((i/s)%w)*(s*m) + ((i/(w*s))%m)*s + i%s
  thrust::copy_n(thrust::make_permutation_iterator(d_data.begin(), thrust::make_transform_iterator(thrust::counting_iterator<itype>(0), (_1/(w*s*m))*(w*s*m) + ((_1/s)%w)*(s*m) + ((_1/(w*s))%m)*s + _1%s)), L*m*s, d_result.begin());
  thrust::copy(d_result.begin(), d_result.end(), std::ostream_iterator<dtype>(std::cout, ","));
  std::cout << std::endl;
}
$ nvcc -o t5 t5.cu
$ ./t5
0,1,2,6,7,8,3,4,5,9,10,11,12,13,14,18,19,20,15,16,17,21,22,23,
$

Note that if the overall data length (L*m*s) is greater than what can be held in a int quantity, then the above code would have to be refactored to use size_t instead of int for the itype typedef.

Robert Crovella
  • 143,785
  • 11
  • 213
  • 257