Questions tagged [run-length-encoding]

Run-length encoding (RLE) is a very simple form of data compression in which runs of data (that is, sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and count, rather than as the original run.

Run-length encoding is most useful on data that contains many such runs: for example, simple graphic images such as icons, line drawings, and animations. It is not useful for files with few runs, where it can increase the file size because every short run still has to be stored as both a value and a count.
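As a rough illustration of the description above, here is a minimal Python sketch of encoding and decoding. The function names `rle_encode` and `rle_decode` are hypothetical and chosen only for this example; they are not taken from any of the questions listed below. The sketch assumes the input is any sequence whose items can be compared for equality.

```python
from itertools import groupby


def rle_encode(data):
    """Collapse each run of equal adjacent items into a (value, count) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(data)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into a flat list of items."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# Data with long runs compresses well: 9 items become 3 pairs.
pairs = rle_encode("WWWWBBBWW")
print(pairs)                       # [('W', 4), ('B', 3), ('W', 2)]
print("".join(rle_decode(pairs)))  # WWWWBBBWW

# Data without runs grows: 6 items become 6 pairs (12 stored values).
print(len(rle_encode("ABCDEF")))   # 6
```

The last call shows the case the description warns about: when no value repeats, every item still needs its own count, so the encoded form is larger than the input.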

254 questions
36
votes
6 answers

Is there a dplyr equivalent to data.table::rleid?

data.table offers a nice convenience function, rleid for run-length encoding: library(data.table) DT = data.table(grp=rep(c("A", "B", "C", "A", "B"), c(2, 2, 3, 1, 2)), value=1:10) rleid(DT$grp) # [1] 1 1 2 2 3 3 3 4 5 5 I can mimic this in base R…
JasonAizkalns
  • 20,243
  • 8
  • 57
  • 116
25
votes
3 answers

Create counter within consecutive runs of values

I wish to create a sequential number within each run of equal values, like a counter of occurrences, which restarts once the value in the current row is different from the previous row. Please find an example of input and expected output…
Richard
  • 1,224
  • 3
  • 16
  • 32
19
votes
2 answers

Use rle to group by runs when using dplyr

In R, I want to summarize my data after grouping it based on the runs of a variable x (aka each group of the data corresponds to a subset of the data where consecutive x values are the same). For instance, consider the following data frame, where I…
josliber
  • 43,891
  • 12
  • 98
  • 133
18
votes
7 answers

Element-wise array replication in Matlab

Let's say I have a one-dimensional array: a = [1, 2, 3]; Is there a built-in Matlab function that takes an array and an integer n and replicates each element of the array n times? For example calling replicate(a, 3) should return…
Dima
  • 38,860
  • 14
  • 75
  • 115
16
votes
5 answers

Repeat copies of array elements: Run-length decoding in MATLAB

I'm trying to insert multiple values into an array using a 'values' array and a 'counter' array. For example, if: a=[1,3,2,5] b=[2,2,1,3] I want the output of some function c=somefunction(a,b) to be c=[1,1,3,3,2,5,5,5] Where a(1) recurs b(1)…
Doresoom
  • 7,398
  • 14
  • 47
  • 61
13
votes
4 answers

Run-length decoding in MATLAB

For clever usage of linear indexing or accumarray, I've sometimes felt the need to generate sequences based on run-length encoding. As there is no built-in function for this, I am asking for the most efficient way to decode a sequence encoded in…
knedlsepp
  • 6,065
  • 3
  • 20
  • 41
11
votes
2 answers

Lossless hierarchical run length encoding

I want to summarize rather than compress in a similar manner to run length encoding but in a nested sense. For instance, I want: ABCBCABCBCDEEF to become: (2A(2BC))D(2E)F I am not concerned that an option is picked between two identical possible…
11
votes
4 answers

Create counter for runs of TRUE among FALSE and NA, by group

I have a little nut to crack. I have a data.frame where runs of TRUE are separated by runs of one or more FALSE or NA: group criterium 1 A NA 2 A TRUE 3 A TRUE 4 A TRUE 5 A FALSE 6 A …
Humpelstielzchen
  • 6,126
  • 3
  • 14
  • 34
11
votes
2 answers

Find start and end positions/indices of runs/consecutive values

Problem: Given an atomic vector, find the start and end indices of runs in the vector. Example vector with runs: x = rev(rep(6:10, 1:5)) # [1] 10 10 10 10 10 9 9 9 9 8 8 8 7 7 6 Output from rle(): rle(x) # Run Length Encoding # lengths:…
Clara
  • 411
  • 4
  • 13
11
votes
6 answers

MATLAB repeat numbers based on a vector of lengths

Is there a vectorised way to do the following? (shown by an example): input_lengths = [ 1 1 1 4 3 2 1 ] result = [ 1 2 3 4 4 4 4 5 5 5 6 6 7 ] I have spaced out the input_lengths so it is easy to understand how the result is…
Samuel O'Malley
  • 3,471
  • 1
  • 23
  • 41
10
votes
11 answers

Efficiently find the first of the last 1's sequence

I have the following vectors with 0s and 1s: test1 <- c(rep(0,20),rep(1,5),rep(0,10),rep(1,15)) test1 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 …
one
  • 3,121
  • 1
  • 4
  • 24
8
votes
4 answers

Create group names for consecutive values

Looks like an easy task, can't figure out a simpler way. I have an x vector below, and need to create group names for consecutive values. My attempt was using rle, better ideas? # data x <- c(1,1,1,2,2,2,3,2,2,1,1) # make…
zx8754
  • 52,746
  • 12
  • 114
  • 209
8
votes
4 answers

Element-wise array replication according to a count

My question is similar to this one, but I would like to replicate each element according to a count specified in a second array of the same size. An example of this, say I had an array v = [3 1 9 4], I want to use rep = [2 3 1 5] to replicate the…
merv
  • 1,449
  • 3
  • 13
  • 25
7
votes
3 answers

Binary run length encoding

I have a web form, for the contents of which I would like to generate a short representation in Base64. The form, among other things, contains a list of 264 binary values, the greater part of which are going to be 0 at any single time. (They…
avramov
  • 2,119
  • 2
  • 18
  • 41
7
votes
1 answer

Decoding RLE (run-length encoding) mask with Tensorflow Datasets

I have been experimenting with tensorflow Datasets but I cannot figure out how to efficiently create RLE-masks. FYI, I am using data from the Airbus Ship Detection Challenge in Kaggle: https://www.kaggle.com/c/airbus-ship-detection/data I know my…
Alex
  • 445
  • 4
  • 13