
I'm trying to insert multiple values into an array using a 'values' array and a 'counter' array. For example, if:

a=[1,3,2,5]
b=[2,2,1,3]

I want the output of some function

c=somefunction(a,b)

to be

c=[1,1,3,3,2,5,5,5]

where a(1) recurs b(1) times, a(2) recurs b(2) times, etc...

Is there a built-in function in MATLAB that does this? I'd like to avoid using a for loop if possible. I've tried variations of 'repmat()' and 'kron()' to no avail.

This is basically run-length decoding.

Doresoom

5 Answers


Problem Statement

We have an array of values, vals, and an array of run lengths, runlens:

vals     = [1,3,2,5]
runlens  = [2,2,1,3]

We need to repeat each element in vals the number of times given by the corresponding element in runlens. Thus, the final output would be:

output = [1,1,3,3,2,5,5,5]

Prospective Approach

One of the fastest tools in MATLAB is cumsum, and it is very useful when vectorizing problems that involve irregular patterns. In the stated problem, the irregularity comes from the different elements in runlens.

Now, to exploit cumsum, we need to do two things here: initialize an array of zeros, and place "appropriate" values at "key" positions in that zeros array, such that after cumsum is applied we end up with each element of vals repeated its runlens number of times.

Steps: Let's number the above-mentioned steps to make the prospective approach easier to follow:

1) Initialize zeros array: What must its length be? Since each element is repeated its runlens number of times, the length of the zeros array must be the sum of all runlens.

2) Find key positions/indices: These key positions are the places along the zeros array where each element from vals starts to repeat. Thus, for runlens = [2,2,1,3], the key positions mapped onto the zeros array would be:

[X 0 X 0 X X 0 0] % where X's are those key positions.

3) Find appropriate values: The final nail to be hammered in before using cumsum is to put "appropriate" values into those key positions. Since we apply cumsum right afterwards, what we need is a differenced version of vals, obtained with diff, so that cumsum brings back the original values. With these differenced values placed on the zeros array at positions separated by the runlens distances, cumsum then produces each vals element repeated runlens times as the final output.

Solution Code

Here's the implementation stitching up all the above mentioned steps -

% Calculate cumsumed values of runlens.
% We would need this to initialize the zeros array and find key positions later on.
clens = cumsum(runlens)

% Initialize zeros array
array = zeros(1,clens(end))

% Find key positions/indices
key_pos = [1 clens(1:end-1)+1]

% Find appropriate values
app_vals = diff([0 vals])

% Map app_vals onto array at key_pos
array(key_pos) = app_vals

% cumsum array for final output
output = cumsum(array)
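
For the question's sample input, the intermediate arrays work out like this (a quick walk-through; the values in the comments follow directly from the steps above):

vals     = [1,3,2,5];
runlens  = [2,2,1,3];

clens    = cumsum(runlens)       % [2 4 5 8]
key_pos  = [1 clens(1:end-1)+1]  % [1 3 5 6]
app_vals = diff([0 vals])        % [1 2 -1 3]

array          = zeros(1,clens(end));
array(key_pos) = app_vals        % [1 0 2 0 -1 3 0 0]

output = cumsum(array)           % [1 1 3 3 2 5 5 5]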

Pre-allocation Hack

As can be seen, the code listed above uses pre-allocation with zeros. Now, according to this Undocumented MATLAB blog post on faster pre-allocation, one can achieve much faster pre-allocation with -

array(clens(end)) = 0; % instead of array = zeros(1,(clens(end)))
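
A rough way to compare the two pre-allocation styles on your own machine (a simple tic/toc sketch; the gain varies with MATLAB release and array size):

N = 1e7;
tic, for k = 1:50, clear a, a = zeros(1,N); end, toc   % standard zeros() pre-allocation
tic, for k = 1:50, clear a, a(N) = 0;       end, toc   % indexing-based pre-allocation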

Wrapping up: Function Code

To wrap everything up, here is a compact function to perform this run-length decoding -

function out = rld_cumsum_diff(vals,runlens)
clens = cumsum(runlens);
idx(clens(end))=0;
idx([1 clens(1:end-1)+1]) = diff([0 vals]);
out = cumsum(idx);
return;
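
As a quick sanity check, calling it with the question's inputs reproduces the expected output:

>> rld_cumsum_diff([1,3,2,5],[2,2,1,3])
ans =
     1     1     3     3     2     5     5     5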

Benchmarking

Benchmarking Code

Listed next is the benchmarking code to compare runtimes and speedups for the stated cumsum+diff approach in this post against the other, cumsum-only based approach, run on MATLAB R2014b -

datasizes = [reshape(linspace(10,70,4).'*10.^(0:4),1,[]) 10^6 2*10^6]; %
fcns = {'rld_cumsum','rld_cumsum_diff'}; % approaches to be benchmarked

for k1 = 1:numel(datasizes)
    n = datasizes(k1); % Create random inputs
    vals = randi(200,1,n);
    runs = [5000 randi(200,1,n-1)]; % 5000 acts as an aberration
    for k2 = 1:numel(fcns) % Time approaches  
        tsec(k2,k1) = timeit(@() feval(fcns{k2}, vals,runs), 1);
    end
end

figure,      % Plot runtimes
loglog(datasizes,tsec(1,:),'-bo'), hold on
loglog(datasizes,tsec(2,:),'-k+')
set(gca,'xgrid','on'),set(gca,'ygrid','on'),
xlabel('Datasize ->'), ylabel('Runtimes (s)')
legend(upper(strrep(fcns,'_',' '))),title('Runtime Plot')

figure,      % Plot speedups
semilogx(datasizes,tsec(1,:)./tsec(2,:),'-rx')        
set(gca,'ygrid','on'), xlabel('Datasize ->')
legend('Speedup(x) with cumsum+diff over cumsum-only'),title('Speedup Plot')

Associated function code for rld_cumsum.m:

function out = rld_cumsum(vals,runlens)
index = zeros(1,sum(runlens));
index([1 cumsum(runlens(1:end-1))+1]) = 1;
out = vals(cumsum(index));
return;

Runtime and Speedup Plots

(Runtime plot: log-log runtimes of both approaches vs. datasize)

(Speedup plot: speedup of cumsum+diff over cumsum-only vs. datasize)

Conclusions

The proposed approach seems to give us a noticeable speedup over the cumsum-only approach: about 3x!

Why is this new cumsum+diff based approach better than the previous cumsum-only approach?

Well, the essence of the reason lies in the final step of the cumsum-only approach, which needs to map the "cumsumed" values into vals. In the new cumsum+diff based approach we do diff([0 vals]) instead, for which MATLAB processes only n elements (where n is the number of run lengths), compared with mapping sum(runlens) elements in the cumsum-only approach. That number is usually many times larger than n, hence the noticeable speedup with the new approach!
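
For contrast, these are the lines where the two approaches actually differ (repeated from the code in this post; the cumsum over sum(runlens) elements is common to both):

% cumsum-only (rld_cumsum): after the cumsum, a gather indexes
% sum(runlens) elements of vals
out = vals(cumsum(index));

% cumsum+diff (rld_cumsum_diff): the gather is replaced by a diff and a
% scatter over just numel(vals) elements, done before the single cumsum
idx([1 clens(1:end-1)+1]) = diff([0 vals]);
out = cumsum(idx);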

Divakar
  • Thank you for the excellent answer. I think it may be one of the most thorough and well explained I've ever gotten on SO. – Doresoom Mar 16 '15 at 15:21
  • Nice optimization! Out of curiosity, where does the built-in MATLAB function fit on the curve? It might be useful to add it as a baseline. – chappjc Mar 16 '15 at 15:37
  • @chappjc Thanks! Well that's the issue really, don't have access to `MATLAB 2015A`! Looking for volunteers on that, as stated in the bounty comment! :) – Divakar Mar 16 '15 at 15:46
  • @Doresoom Glad to be of help on this! Seems like run-length decoding is quite useful and this `cumsum+diff` trick struck me recently! – Divakar Mar 16 '15 at 15:49
  • @Divakar Oh, I see. Didn't read that. :) If I get a chance, I'll run it today. But I can say now that it's not M-code backed, pure built-in or Mex. – chappjc Mar 16 '15 at 16:20
  • @chappjc Would be interesting to see how it performs against these cumsum's ones then! I think a CW answer would be suited for the benchmarks. – Divakar Mar 16 '15 at 16:24
  • `rle_cumsum_diff` is much faster than MATLAB's `repelem`. Well done! Also, `repelem` has trouble with large data sizes, and gets much slower than even `rld_cumsum`. I'll put the graphs in a CW shortly. – chappjc Mar 16 '15 at 17:38
  • @chappjc Very kool! I made a slight typo there, the function must be named as `rld_cumsum_diff` instead of `rle_cumsum_diff`, because the plots would get the legends accordingly. RLD for run-length decoding that is. – Divakar Mar 16 '15 at 17:42
  • @Divakar Good idea! It seems so obvious now :-) – Luis Mendo Mar 16 '15 at 21:45
  • @LuisMendo Yeah, this "optimization" wasn't possible with the earlier generalised run-length decoding we did sometime ago covering cell arrays, because the `diff` won't work on them! – Divakar Mar 16 '15 at 21:56
  • @Divakar Yes, I immediately remembered that question. I've suggested chappjc to add to the benchmarking a stripped-down version of the fastest answer there. On my R2014b it's slower, though. In fact, your approach looks hard to beat! – Luis Mendo Mar 16 '15 at 21:58
  • @LuisMendo In fact, I would love to see more approaches being added in this CW answer, but without those cell arrays considerations and zero runlengths, just to keep their essence! – Divakar Mar 16 '15 at 21:59
  • Interesting idea. Even beats the new builtin! It has its downsides though, as it won't work correctly with integer types e.g. `vals = int8([120,-120])` or doubles with large variance `vals = [1e16, 1]` or `Inf`s/`Nan`s. – knedlsepp Mar 16 '15 at 22:43
  • @knedlsepp Thanks! Yup because of `diff()`, good find on corner cases! – Divakar Mar 17 '15 at 03:37
  • Thanks @Divakar. It doesn't seem to support zeros in runlens though, e.g. `rld_cumsum(1:4, [0 0 3 0]) = [1 1 1 2]` instead of the expected `[3 3 3]`. Bit of an edge case, but thought I'd let you know. – Paul Calcraft Apr 08 '16 at 11:03
  • @PaulCalcraft Thanks! Yeah I gotta fix that. Appreciate you testing out! – Divakar Apr 08 '16 at 11:11
  • One of the last lines under the heading *"Solution Code"* is: `array(pos)=app_vals;`. It should be: `array(key_pos)=app_vals;` – Sardar Usama Apr 27 '17 at 16:23
  • @SardarUsama Yup, and the handling of zero-repeat cases is to be updated as well. Thanks for pointing it out! Would update soon, hopefully. – Divakar Apr 27 '17 at 16:39

Benchmarks

Updated for R2015b: repelem now fastest for all data sizes.


Tested functions:

  1. MATLAB's built-in repelem function that was added in R2015a
  2. gnovice's cumsum solution (rld_cumsum)
  3. Divakar's cumsum+diff solution (rld_cumsum_diff)
  4. knedlsepp's accumarray solution (knedlsepp5cumsumaccumarray) from this post
  5. Naive loop-based implementation (naive_jit_test.m) to test the just-in-time compiler
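
The naive_jit_test.m file itself is not listed here; a loop-based run-length decoder along the following lines is presumably what it measures (a sketch only, not the actual benchmarked file):

function out = naive_rld_sketch(vals,runlens)
% Plain double loop, relying on MATLAB's JIT compiler for speed
out = zeros(1,sum(runlens));
k = 1;
for i = 1:numel(vals)
    for j = 1:runlens(i)
        out(k) = vals(i);
        k = k + 1;
    end
end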

Results of test_rld.m on R2015b:

(Timing plot: repelem vs. the other tested methods, R2015b)

Old timing plot using R2015a here.

Findings:

  • On R2015b, repelem is always the fastest, by roughly a factor of 2.
  • rld_cumsum_diff is consistently faster than rld_cumsum.
  • On R2015a, repelem was fastest only for small data sizes (less than about 300-500 elements).
  • On R2015a, rld_cumsum_diff became significantly faster than repelem around 5,000 elements.
  • On R2015a, repelem became slower than rld_cumsum somewhere between 30,000 and 300,000 elements.
  • rld_cumsum has roughly the same performance as knedlsepp5cumsumaccumarray.
  • naive_jit_test.m has nearly constant speed and is on par with rld_cumsum and knedlsepp5cumsumaccumarray for smaller sizes, and a little faster for large sizes.

(Processing rate plot, R2015b)

Old rate plot using R2015a here.

Conclusion

On R2015a, use repelem below about 5,000 elements and the cumsum+diff solution above that; on R2015b, repelem is the fastest at all data sizes.

chappjc
  • Thank you! Interesting processing rate plot! – Divakar Mar 16 '15 at 18:21
  • @Divakar Thanks. It was a little hard to see the transitions in the `loglog` plot of runtimes. – chappjc Mar 16 '15 at 18:21
  • It would be nice to add the fastest function from [this similar question](http://stackoverflow.com/a/28502898/2586922) here, namely `knedlsepp5cumsumaccumarray`. To be fair, it would have to be stripped down to `V = cumsum(accumarray(cumsum([1; runLengths(:)]), 1)); V = values(V(1:end-1));` (no checking etc). On my R2014b it's slower, though – Luis Mendo Mar 16 '15 at 21:56
  • My code is basically the same as *gnovice*'s, as *Divakar* pointed me to it when researching this topic. The only difference is the use of `accumarray` instead of `index(...) = 1`, to make his code work when the `runLengths` array includes zeros. That's why those perform quite similarly. – knedlsepp Mar 16 '15 at 23:56
  • @knedlsepp I saw, and I approved! :D – chappjc Mar 17 '15 at 00:01
  • @chappjc Thanks for adding JIT and knedlsepp's solutions! – Divakar Mar 17 '15 at 03:40
  • Seems like `for` will be sitting at the cool kids' table after all! – knedlsepp Mar 17 '15 at 09:58
  • @knedlsepp It almost pains me to write code like that in MATLAB, but that shows how much it has improved... It's like writing in C. Very odd. – chappjc Mar 17 '15 at 15:24
  • @Divakar Hey, thanks for the bounty award! I've been on vacation and haven't really looked closely at my profile. A nice surprise! – chappjc Mar 31 '15 at 04:32

There's no built-in function I know of, but here's one solution:

index = zeros(1,sum(b));
index([1 cumsum(b(1:end-1))+1]) = 1;
c = a(cumsum(index));

Explanation:

A vector of zeros is first created with the same length as the output array (i.e. the sum of all the replications in b). Ones are then placed at the first element and at each subsequent element that marks where a new sequence of values starts in the output. The cumulative sum of the vector index can then be used to index into a, replicating each value the desired number of times.

For the sake of clarity, this is what the various vectors look like for the values of a and b given in the question:

        index = [1 0 1 0 1 1 0 0]
cumsum(index) = [1 1 2 2 3 4 4 4]
            c = [1 1 3 3 2 5 5 5]

EDIT: For the sake of completeness, there is another alternative using ARRAYFUN, but this seems to take anywhere from 20-100 times longer to run than the above solution with vectors up to 10,000 elements long:

c = arrayfun(@(x,y) x.*ones(1,y),a,b,'UniformOutput',false);
c = [c{:}];
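
To reproduce that comparison yourself, a single-run tic/toc sketch along these lines should do (exact ratios will vary with MATLAB release and machine):

n = 10000;
a = randi(100,1,n);
b = randi(10,1,n);   % run lengths of 1-10, no zeros

tic   % indexing-based solution above
index = zeros(1,sum(b));
index([1 cumsum(b(1:end-1))+1]) = 1;
c1 = a(cumsum(index));
toc

tic   % ARRAYFUN alternative
c2 = arrayfun(@(x,y) x.*ones(1,y),a,b,'UniformOutput',false);
c2 = [c2{:}];
toc

isequal(c1,c2)   % both approaches give the same result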
gnovice

There is finally (as of R2015a) a built-in and documented function to do this, repelem. The following syntax, where the second argument is a vector, is relevant here:

W = repelem(V,N), with vector V and vector N, creates a vector W where element V(i) is repeated N(i) times.

Or put another way, "Each element of N specifies the number of times to repeat the corresponding element of V."

Example:

>> a=[1,3,2,5]
a =
     1     3     2     5
>> b=[2,2,1,3]
b =
     2     2     1     3
>> repelem(a,b)
ans =
     1     1     3     3     2     5     5     5
chappjc

The performance problems in MATLAB's built-in repelem have been fixed as of R2015b. I have run the test_rld.m program from chappjc's post on R2015b, and repelem is now faster than the other algorithms by about a factor of 2:

(Updated plot comparing repelem execution time of the different methods)

(Speedup of repelem over cumsum+diff: about a factor of 2)

mikkola
  • Thanks for the information. I've updated my CW post. I never thought about it as being a performance _problem_, otherwise I would have submitted a bug report to you. But great news! – chappjc Dec 04 '15 at 22:01