
I have a data frame and would like to sum adjacent values in each column until a 0 is reached. The output should show the maximum sum achievable for each column.

The data frame, with the expected output in the SUM row, looks like this:

    A B C D
X1  0 1 0 1
X2  1 0 1 1
X3  0 1 1 1
X4  1 1 1 1
SUM 1 2 3 4
NelsonGon
  • See `help("rle")`. – Roland Jun 07 '19 at 10:22
  • Hi Alfredo, welcome to SO! Please have a look at [link](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) to improve your question and avoid the downvotes. – Sven Jun 07 '19 at 10:23

1 Answer


1) We assume that:

  • the input consists of a data frame of 0's and 1's as suggested by the example input in the question. (For the example, we use the data frame shown reproducibly in the Note at the end.)

  • we want the length of the longest run of 1's

  • if there are no 1's in a column, then the sum for that column should be 0. The solution could be simplified slightly if we knew that every column has at least one 1, but we do not make that assumption here.

We use the following code. No packages are used.

sum1 <- function(x) with(rle(x), max(lengths[values == 1], 0))
apply(DF, 2, sum1)
##     A B C D 
##     1 2 3 4 

See ?rle for more information on computing the length of runs.
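To see why sum1 works, it may help to look at what rle returns for a single column. The following is a minimal sketch using the values of column B from the example (1, 0, 1, 1); the names b, r and longest are only for illustration.

```r
# Column B from the example input: 1, 0, 1, 1
b <- c(1, 0, 1, 1)

# rle() collapses the vector into runs; each run has a length and a value
r <- rle(b)
r$lengths  # 1 1 2
r$values   # 1 0 1

# Keep only the runs of 1's and take the longest. The trailing 0 covers
# a column with no 1's, where the subset of lengths would be empty.
longest <- max(r$lengths[r$values == 1], 0)
longest    # 2
```

Without the extra 0, max would be called on an empty vector for an all-zero column and return -Inf with a warning.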

2) Here we do not assume that the data frame consists of just 0's and 1's. For each stretch of non-zero values we sum them and return the largest sum. This uses the data.table package's rleid.

library(data.table)

sum2 <- function(x) max(tapply(x, rleid(x != 0), sum))
apply(DF, 2, sum2)
##     A B C D 
##     1 2 3 4 
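The example data only contains 0's and 1's, so it does not show how the second approach differs from the first. Below is a sketch on a made-up vector x with other values. So that it runs without any packages, it builds the run ids with cumsum as a base-R stand-in for rleid; this is an illustrative substitute, not the code used above.

```r
# Hypothetical column with values other than 0 and 1
x <- c(2, 3, 0, 5, 1, 4, 0, 7)

# Base-R stand-in for data.table::rleid(x != 0): a new id starts
# whenever the non-zero indicator changes from one element to the next
nz <- x != 0
run_id <- cumsum(c(TRUE, nz[-1] != nz[-length(nz)]))
run_id  # 1 1 2 3 3 3 4 5

# Sum within each run (runs of zeros sum to 0) and keep the largest sum
run_sums <- tapply(x, run_id, sum)
max(run_sums)  # 10
```

Here the stretch 5, 1, 4 gives the largest sum, 10, even though the first approach would only report its length, 3.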

Note

The input DF shown in reproducible form is assumed to be:

Lines <- "    A B C D
X1  0 1 0 1
X2  1 0 1 1
X3  0 1 1 1
X4  1 1 1 1"
DF <- read.table(text = Lines)
G. Grothendieck