I have a column of values and want to check whether the sum of 5 consecutive values (within a certain range, rows 259 to 272) is > 10 and whether at least two of those 5 values are > 3.

This is what I used to compute the sum of 5 consecutive values. It divides my range into overlapping blocks of 5 rows (ten blocks for rows 259 to 272) and checks each block individually.

data <- read.table("....csv", header=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE)
interval <- 5
start <- 259
end <- 272
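# sum each window of 5 rows; "column" is a placeholder for the actual column name
block <- sapply(start:(end - interval + 1), function(x) sum(data[x:(x + interval - 1), "column"]))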

Now I check if the value of the block is > 10

if (block[[1]] > 10) {
  print(paste("block to fulfill the condition is block", 1))
} else if (block[[2]] > 10) {
  print(paste("block to fulfill the condition is block", 2))
....

How can I include the condition "two out of 5 values from a block have to be > 3" into my if-clause?

Alwin
  • What would be your expected output? Those 5 values? – Ronak Shah Nov 23 '16 at 07:35
  • [How to make a great R reproducible example?](http://stackoverflow.com/questions/5963269) – zx8754 Nov 23 '16 at 07:47
  • @RonakShah My expected output will be the number of the block where the conditions (sum of all 5 values within the block > 10 and at least 2 out of 5 values from the block > 3) are met for the first time – Alwin Nov 23 '16 at 07:52

4 Answers

To make this a reproducible example, I tried it on the mtcars dataset.

Rows 259 to 272 are replaced by rows 20 to 30 of the gear column, and the conditions become: the sum of 5 consecutive values is greater than 20 and at least 2 of the 5 values are greater than 3.

library(zoo)
subvec = mtcars[20:30, "gear"]
subvec
#[1] 4 3 3 3 3 3 4 5 5 5 5
idx <- which(rollsum(subvec, 5) > 20 & rollapply(subvec, 5, 
                                           function(x) sum(x > 3)) >= 2)[1]
idx
# [1] 6
subvec[idx:(idx+4)]
#[1] 3 4 5 5 5

So I think this should work on your dataset, with "column" replaced by the name of your column:

library(zoo)
subvec = data[259:272, "column"]
idx <- which(rollsum(subvec, 5) > 10 & rollapply(subvec,5, 
                                              function(x) sum(x > 3)) >= 2)[1]
subvec[idx:(idx+4)]

As @G.Grothendieck mentioned, the code can be simplified further. Instead of rollapply, we can apply rollsum to the logical vector subvec > 3, which counts the TRUE values in each window:

idx <- which(rollsum(subvec, 5) > 10 & rollsum(subvec > 3, 5) >= 2)[1]
Ronak Shah

Having data:

set.seed(1453)
x = sample(-3:7, 13, TRUE)
n = 5
x
# [1]  4  1  6 -1  2  3  5  0  1  4  1  5  5

one approach is:

ex = embed(x, n)
(rowSums(ex) > 10) & (rowSums(ex > 3) >= 2)
#[1]  TRUE FALSE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
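
For readers unfamiliar with embed, a small illustration on a toy vector (not the data above) may help. Each row of the result is one window of n consecutive values, stored in reverse order, which makes no difference to rowSums:

```r
# Toy vector, chosen only for illustration
v <- 1:7
w <- embed(v, 5)   # one row per window of 5 consecutive values
w
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    5    4    3    2    1
# [2,]    6    5    4    3    2
# [3,]    7    6    5    4    3
rowSums(w)
# [1] 15 20 25
```

Row i of the matrix corresponds to the window starting at element i of the vector.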

To avoid recalculating the same additions, we could use cumsum:

cs1 = cumsum(x)
cond1 = cs1[n:length(x)] - c(0, cs1[1:(length(x) - n)]) > 10

cs2 = cumsum(x > 3)
cond2 = cs2[n:length(x)] - c(0, cs2[1:(length(x) - n)]) >= 2

cond1 & cond2
#[1]  TRUE FALSE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
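
Since the question asks for the number of the first block that meets both conditions, the logical vector can be reduced with which. The following is only a sketch: first_block and its argument names are made up here, and it assumes the values are in a plain numeric vector of at least n elements.

```r
# Hypothetical helper (not from the answer above): index of the first window
# of n values whose sum exceeds sum_min and that contains at least n_big
# values greater than big. Returns NA if no window qualifies.
first_block <- function(x, n = 5, sum_min = 10, big = 3, n_big = 2) {
  ex <- embed(x, n)
  ok <- (rowSums(ex) > sum_min) & (rowSums(ex > big) >= n_big)
  which(ok)[1]
}

x <- c(4, 1, 6, -1, 2, 3, 5, 0, 1, 4, 1, 5, 5)  # the vector printed above
first_block(x)
# [1] 1
first_block(x[2:13])   # drop the first value: the first hit moves to window 2
# [1] 2

# If x held the column values from row 259 onwards (start is taken from the
# question purely as an illustration), block k would begin at row start + k - 1
start <- 259
start + first_block(x[2:13]) - 1
# [1] 260
```

Returning the index rather than printing a message makes it straightforward to map the block back to a row of the original data.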
alexis_laz

I am not quite sure if this is what you want, but here is a function that checks both conditions, given a column, start_row, and end_row:

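finder = function(column, start_row, end_row, threshold_1 = 10){
  # stop 4 rows early so every 5-value window stays within start_row:end_row
  for(i in start_row:(end_row - 4)){
    window = column[i:(i + 4)]
    # condition 1: sum above threshold_1; condition 2: at least two values > 3
    if(sum(window) > threshold_1 && sum(window > 3) >= 2){
      print(paste("sum of row", i, "and its 4 consecutive values is greater than", threshold_1))
      print("And at least two out of the 5 values are greater than 3")
      return("END")
    }
  }
}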

This is quite messy, but mostly due to the print messages. And this is what you get:

set.seed(123)
col = sample(1:5, 300, T)
finder(col,259,279)

[1] "sum of row 269 and its 4 consecutive values is greater than 10"
[1] "And at least two out of the 5 values are greater than 3"
[1] "END"
Chris

You can use rollapply from the zoo package together with intersect, as follows:

library(zoo)
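# windows whose sum exceeds 10
ind1 <- which(rollapply(data$v1, 5, by = 1, sum) > 10)
# windows with at least two values above 3 (same column as above)
ind2 <- which(rollapply(data$v1, 5, by = 1, function(i) length(i[i > 3]) >= 2))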
intersect(ind1, ind2)
Sotos