19

I would like to merge a large number of images using ImageMagick's convert.exe, but under Windows I am limited to an 8192-byte command line.

My solution is to split the task into smaller sub-tasks, run them, and then run a final task that combines the results.

My idea is to write a function that takes a vector of images and an integer n, and splits the vector into n sub-vectors of "almost equal" size.

So, for example, splitting 11 elements into 3 groups would give 4-4-3.

Can you tell me how to do this in C++? I mean, how to write a function

split_vec( const vector<image> &images, int split )

which does the splitting?

Also, what is the most efficient way to do this if I don't need to create new vectors and only want to iterate through the sub-parts, like std::string::substr does for std::string?

Note: I already use Boost in the project, so if there is some nice tool in Boost for this then it's perfect for me.

hyperknot

8 Answers

14

To get a base number for the size of each part, simply divide the total by the number of parts: 11/3 = 3. Obviously some of the parts will need to be bigger than that to get the proper total, but that's just the remainder: 11 % 3 = 2. So now you know that 2 of the parts will be size 3+1, and whatever's left over will be 3.
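As a minimal sketch of that arithmetic, the chunk sizes could be computed as below. The helper name `chunk_sizes` is hypothetical and does not come from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the size of each of `parts` chunks when
// `total` items are divided as evenly as possible.
std::vector<std::size_t> chunk_sizes(std::size_t total, std::size_t parts)
{
    assert(parts > 0);                       // cannot divide into zero parts
    const std::size_t base  = total / parts; // size every chunk gets at least
    const std::size_t extra = total % parts; // number of chunks that get one more

    std::vector<std::size_t> sizes(parts, base);
    for (std::size_t i = 0; i < extra; ++i)
        ++sizes[i];                          // the first `extra` chunks are base + 1
    return sizes;
}
```

For the example in the question, `chunk_sizes(11, 3)` yields 4, 4, 3.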

Mark Ransom
  • Thanks, here is what I came up with: `double loop = number / parts; for( int i = 0; i < parts; i++ ) { int start = i * loop; int end = ( i + 1 ) * loop - 1; }` – hyperknot Jul 28 '11 at 17:20
  • 1
    @zsero, if both `number` and `parts` are integers you'll need to convert one to double before doing the division. Also you'll need to worry about roundoff error, there are cases where you might get an off-by-one error when you convert back to integer. – Mark Ransom Jul 28 '11 at 17:26
  • Actually I use doubles in the function definition, and a round() function for start and end. Do you think it is possible to have roundoff error when using round() function? (I use stringstream to round) – hyperknot Jul 28 '11 at 18:43
  • @zsero, if you're using rounding instead of truncation for start and end you should be OK. You left that part off of your previous comment. – Mark Ransom Jul 28 '11 at 21:11
7

Here is my solution:

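#include <algorithm>
#include <cstddef>
#include <vector>

template<typename T>
std::vector<std::vector<T>> SplitVector(const std::vector<T>& vec, size_t n)
{
    std::vector<std::vector<T>> outVec;
    if (n == 0)
        return outVec; // avoid dividing by zero below

    size_t length = vec.size() / n; // minimum size of each part
    size_t remain = vec.size() % n; // number of parts that get one extra element

    size_t begin = 0;
    size_t end = 0;

    // If n exceeds vec.size(), only vec.size() one-element parts are produced.
    for (size_t i = 0; i < std::min(n, vec.size()); ++i)
    {
        end += length;
        if (remain > 0)
        {
            ++end;
            --remain;
        }

        outVec.push_back(std::vector<T>(vec.begin() + begin, vec.begin() + end));

        begin = end;
    }

    return outVec;
}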
Yury
1

You could create a function template that receives the vector you want to split and the number of divisions, and returns a std::vector<std::vector<T>>. Using a for loop and iterators, this is very easy.

#include <cstdint>
#include <iostream>
#include <iomanip>
#include <vector>
#include <algorithm>
#include <numeric>

template<typename T>
std::vector< std::vector<T> > split(const std::vector<T>& vec, uint64_t n) {
  std::vector< std::vector<T> > vec_of_vecs(n);
  if (n == 0)
    return vec_of_vecs;  // nothing to split into; also avoids dividing by zero

  uint64_t quotient = vec.size() / n;
  uint64_t remainder = vec.size() % n;
  uint64_t first = 0;
  uint64_t last;
  for (uint64_t i = 0; i < n; ++i) {
    // The first `remainder` parts get one extra element.
    last = first + quotient + (i < remainder ? 1 : 0);
    vec_of_vecs[i] = std::vector<T>(vec.begin() + first, vec.begin() + last);
    first = last;
  }

  return vec_of_vecs;
}

#define ONE_DIMENSION 11
#define SPLITS 3

int main(void)
{
  std::vector<uint64_t> vector(ONE_DIMENSION);
  std::iota(std::begin(vector), std::end(vector), 1);

  std::vector<std::vector<uint64_t>> vecs(SPLITS);
  vecs = split(vector, SPLITS);

  for (uint64_t m = 0; m < vecs.size(); ++m) {
    for (auto i : vecs[m])
      std::cout << std::setw(3) << i << " ";
    std::cout << std::endl;
  }


  return 0;
}
Moises Rojo
1

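CreateProcess has a 32 KB limit on the command line (32,767 characters), so calling it directly instead of going through the shell may already be enough.

Or, if you want to go via the shell, process the vector in ranges of at most stride elements:

typedef std::vector<image> image_vec; // element type taken from the question
image_vec::const_iterator i = vec.begin();
image_vec::const_iterator last = vec.end();

// Compare the remaining distance instead of advancing first, so that
// no iterator is ever moved past the end of the vector.
// stride is assumed to be a positive std::ptrdiff_t.
while (last - i > stride) {
    do_range(i, i + stride);
    i += stride;
}

do_range(i, last);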
spraff
1

Have you thought about using the xargs program? This may be a high-level solution to the problem.

Mike
  • 2
    I use "unix" utilities on my Windows machines all the time. checkout: unxutils.sf.net and/or www.cygwin.com – Mike Jul 28 '11 at 15:18
  • Thanks for the tip, although I'm afraid this won't help him run the code on *someone else's* computer :-P – spraff Jul 28 '11 at 15:25
  • Why? `xargs` is a stand-alone program. Distribute it with his program. – Mike Jul 28 '11 at 15:35
1

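You don't have to create new sub-vectors; use something like the following:

// Processes images[begin, end) and returns the number of elements handled.
size_t ProcessSubVec(const vector<Image>& images, size_t begin, size_t end)
{
    // your processing logic
    return end - begin;
}

void SplitVec(const vector<Image>& images, int cnt)
{
    if (cnt <= 0)
        return; // nothing sensible to do, and avoids dividing by zero

    size_t SubVecLen = images.size() / cnt,
           LeftOvers = images.size() % cnt,
           i = 0;

    // Split into "cnt" partitions; the first LeftOvers partitions get one extra element
    while(i < images.size())
    {
        size_t extra = 0;
        if (LeftOvers > 0) // test before decrementing so the unsigned counter never wraps
        {
            extra = 1;
            --LeftOvers;
        }
        i += ProcessSubVec(images, i, i + SubVecLen + extra);
    }
}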

Hope this helps.

BrandonSun
1

This is how I do it (I know it's very similar to the answer but that was my actual code lol):

template<typename T>
std::vector<std::vector<T>> splitVector(const std::vector<T>& vec, size_t n)
{
    std::vector<std::vector<T>> out_vec;
    size_t length = vec.size() / n;
    size_t remain = vec.size() % n;
    size_t begin = 0;
    size_t end = 0;

    for (size_t i = 0; i < n; ++i)
    {
        end += length + (i < remain);
        out_vec.emplace_back(vec.begin() + begin, vec.begin() + end);
        begin = end;
    }

    return out_vec;
}

You could also return pairs of iterators or the like if you want to avoid copying.
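A rough sketch of that iterator-pair variant might look like the following. The name `splitRanges` is hypothetical, and the returned iterators are only valid while the source vector is alive and unmodified.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns n [begin, end) iterator pairs into `vec`
// instead of copying the elements into new vectors.
template <typename T>
std::vector<std::pair<typename std::vector<T>::const_iterator,
                      typename std::vector<T>::const_iterator>>
splitRanges(const std::vector<T>& vec, std::size_t n)
{
    assert(n > 0); // cannot split into zero parts
    using Iter = typename std::vector<T>::const_iterator;
    std::vector<std::pair<Iter, Iter>> ranges;
    ranges.reserve(n);

    const std::size_t length = vec.size() / n;
    const std::size_t remain = vec.size() % n;

    Iter begin = vec.begin();
    for (std::size_t i = 0; i < n; ++i)
    {
        // The first `remain` ranges are one element longer than the rest.
        Iter end = begin + static_cast<std::ptrdiff_t>(length + (i < remain ? 1 : 0));
        ranges.emplace_back(begin, end);
        begin = end;
    }
    return ranges;
}
```

Each pair can then be handed to any function that accepts a begin/end iterator range, so no image is copied.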

G.Azma
0

You can use iterators to iterate through the sub-parts of the problem. Using iterators is similar to using pointers to the elements of the vector.

What you want to do with the images could be implemented as a function:

using namespace std; 
void do_some_work(vector<image>::iterator begin, vector<image>::iterator end) {
    vector<image>::iterator i = begin ;
    while(i != end) {
        // do something using *i , which will be of type image
        ++i ;
    }
}
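To connect this with the splitting itself, here is one possible sketch that calls such a function once per sub-range. The name `for_each_chunk` is hypothetical, and it assumes random-access iterators such as those of std::vector. It uses integer-only boundaries, so no floating-point rounding is involved; the chunk sizes still differ by at most one, but the larger chunks come last (3-4-4 for 11 elements in 3 parts, rather than 4-4-3).

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical helper: calls work(begin, end) once for each of `parts`
// nearly equal sub-ranges of [first, last). Requires random-access iterators.
template <typename Iter, typename Work>
void for_each_chunk(Iter first, Iter last, std::size_t parts, Work work)
{
    assert(parts > 0); // cannot split into zero parts
    typedef typename std::iterator_traits<Iter>::difference_type diff_t;
    const diff_t total = last - first;
    const diff_t n = static_cast<diff_t>(parts);

    for (diff_t i = 0; i < n; ++i)
    {
        // Integer arithmetic keeps the boundaries exact: chunk i covers
        // offsets [i * total / n, (i + 1) * total / n).
        work(first + i * total / n, first + (i + 1) * total / n);
    }
}
```

With a non-const `std::vector<image> images`, a call such as `for_each_chunk(images.begin(), images.end(), 3, do_some_work);` would run `do_some_work` on three sub-ranges without creating any new vectors.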
Louen