3

I have a code that generates all of the possible combinations of 4 integers between 0 and 36.

This will be 37^4 numbers = 1874161.

My code is written in MATLAB:

i=0;
for a = 0:36
    for b= 0:36
        for c = 0:36
            for d = 0:36
                i=i+1;
                combination(i,:) = [a,b,c,d];             
            end          
        end
    end
end

I've tested this with using the number 3 instead of the number 36 and it worked fine.

If there are 1874161 combinations, and with An overly cautions guess of 100 clock cycles to do the additions and write the values, then if I have a 2.3GHz PC, this is:

1874161 * (1/2300000000) * 100 = 0.08148526086

A fraction of a second. But It has been running for about half an hour so far.

I did receive a warning that combination changes size every loop iteration, consider predefining its size for speed, but this can't effect it that much can it?

Blue7
  • 1,750
  • 4
  • 31
  • 55
  • 2
    Usually it's a good idea to pay attention to such suggestions unless you know what you're doing. Did you try pre-allocating `combination`? – horchler May 22 '15 at 04:59
  • I wasn't certain of the size it was going to be, so I didn't. I don't want to stop and start again now though. It could be done any minute. – Blue7 May 22 '15 at 05:19
  • Wow. I just did it. Reduced the time from over an hour to less than a couple of seconds. – Blue7 May 22 '15 at 05:27
  • 1
    Here's another approach (not so relevant here, but maybe useful for you to know. Let `num=5;maxN=3;` (`num` is the number of integers and `maxN` is the highest integer to include) then `comb=double(dec2base(0:((maxN+1)^(num)-1),maxN+1,num))-48;comb(comb>10)=comb(comb>10)-7;` is nice. However it only works for `maxN` less than or equal to 35. – David May 22 '15 at 05:33
  • That seems like a much nicer way for in the future. Thanks. However, in this example I need as high a value for `maxN` as I can. With the nested for loops I've just managed 120 in a pretty speedy time, however the file size is huge. I can reduce it by half by predefining the integers as `int32` (the minimum size needed to store 121^4), but is there any way of reducing it further? – Blue7 May 22 '15 at 06:08
  • possible duplicate of [matlab - consider preallocating for speed](http://stackoverflow.com/questions/2151636/matlab-consider-preallocating-for-speed) – Franz Wurst May 22 '15 at 07:53
  • Another thing: do not use i and j as indices on matlab. They are the imaginary unit. – Ander Biguri May 22 '15 at 08:24
  • 2
    You can do it without loops using [this approach](http://stackoverflow.com/questions/21895335/generate-a-matrix-containing-all-combinations-of-elements-taken-from-n-vectors). It takes 0.25 seconds in my old computer; 0.04 seconds if you use `uint8` values – Luis Mendo May 22 '15 at 10:13

1 Answers1

7

As @horchler suggested you need to preallocate the target array

This is because your program is not O(N^4) without preallocation. Each time you add new line to array it need to be resized, so new bigger array is created (as matlab do not know how big array it will be it probably increase only by 1 item) and then old array is copied into it and lastly old array is deleted. So when you have 10 items in array and adding 11th, then a copying of 10 items is added to iteration ... if I am not mistaken that leads to something like O(N^12) which is massively more huge

  • estimated as (N^4)*(1+2+3+...+N^4)=((N^4)^3)/2

Also the reallocation process is increasing in size breaching CACHE barriers slowing down even more with increasing i above each CACHE size barrier.

The only solution to this without preallocation is to store the result in linked list

Not sure Matlab has this option but that will need one/two pointer per item (32/64 bit value) which renders your array 2+ times bigger.

If you need even more speed then there are ways (probably not for Matlab):

  1. use multi-threading for array filling is fully parallelisable
  2. use memory block copy (rep movsd) or DMA the data is periodically repeating
  3. You can also consider to compute the value from i on the run instead of remember the whole array, depending on the usage it can be faster in some cases...
Community
  • 1
  • 1
Spektre
  • 49,595
  • 11
  • 110
  • 380