I can't figure out the syntax for deleting the rows of a data frame that contain a certain value, plus the two rows directly below each of them. Can anyone help?

Cheers

luke123
Please give sample data or a reproducible example so that good people here can help you better. See http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – CHP Mar 12 '13 at 08:27

2 Answers

Here is a (not very elegant) way to do it:

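# Sample data
df <- data.frame(x=c(1:5,1:5),y=rnorm(10))
# Computing selection: start by keeping every row
select <- rep(TRUE, nrow(df))
index <- which(df$x==3)
# Rows to drop: each match plus the two rows after it
to_drop <- unique(c(index,index+1,index+2))
# Ignore positions past the last row so `select` keeps its length
select[to_drop[to_drop <= nrow(df)]] <- FALSE
# Rows selection
df[select,]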

Which gives:

  x          y
1 1 -0.2438523
2 2 -0.8004811
6 1  0.5970947
7 2  1.8124529
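
The same idea can be wrapped in a small function so that the number of following rows becomes a parameter. This is only a sketch: `drop_with_following`, `hit` and `n` are hypothetical names that do not appear in the answer, and it uses negative indexing rather than a logical vector.

```r
# Hypothetical helper (not from the answer above): drop every row where `hit`
# is TRUE, plus the `n` rows that follow each such row.
drop_with_following <- function(df, hit, n = 2) {
  index <- which(hit)
  if (length(index) == 0) return(df)   # no match: df[-integer(0), ] would drop everything
  # All positions index + 0, index + 1, ..., index + n
  to_drop <- unique(as.vector(outer(index, 0:n, `+`)))
  to_drop <- to_drop[to_drop <= nrow(df)]   # ignore positions past the last row
  df[-to_drop, , drop = FALSE]
}

df <- data.frame(x = c(1:5, 1:5), y = rnorm(10))
res <- drop_with_following(df, df$x == 3, n = 2)
res$x
# [1] 1 2 1 2
```

The early return matters because indexing with an empty negative vector selects no rows at all, which is the opposite of what is wanted when nothing matches.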
juba

Here is another way. You can write a small utility function that shifts the logical match vector down by one position and ORs it with itself, repeating this once for each extra row you want to remove below a match.

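# Despite the name, nothing wraps around: each pass shifts the mask down one
# position (padding the top with FALSE) and ORs it with the current mask.
cyclic_or_shift <- function(x, times) {
    for (i in seq_len(times))   # seq_len() runs zero passes when times is 0
        x <- x | c(FALSE, head(x, -1))
    x
}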

set.seed(45)
df <- data.frame(x=c(10,20,3,40,50,3,60,70,80), y=rnorm(9))
df[!(cyclic_or_shift(df$x == 3, 2)),]

#    x          y
# 1 10  0.3407997
# 2 20 -0.7033403
# 9 80  1.8090374
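
To see what each shift-and-OR pass does, it may help to trace it by hand on a short logical vector. The names `m`, `s1` and `s2` below are made up for illustration and are not part of the answer.

```r
m  <- c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE)  # a single match at position 3
s1 <- m  | c(FALSE, head(m, -1))    # one pass: the match and the row below it
s2 <- s1 | c(FALSE, head(s1, -1))   # two passes: the match and the two rows below it
which(s1)
# [1] 3 4
which(s2)
# [1] 3 4 5
```

Each pass extends every run of TRUE values downwards by one position, so two passes mark a match plus the next two rows.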

Advantage: you can use it to remove any number of consecutive rows after each match:

set.seed(45)
df <- data.frame(x=c(1,2,3,4,5,6,7,3,8,9,10,3,11,12,13,3))
df$y <- rnorm(nrow(df))
# > df
#     x          y
# 1   1  0.3407997
# 2   2 -0.7033403
# 3   3 -0.3795377
# 4   4 -0.7460474
# 5   5 -0.8981073
# 6   6 -0.3347941
# 7   7 -0.5013782
# 8   3 -0.1745357
# 9   8  1.8090374
# 10  9 -0.2301050
# 11 10 -1.1304182
# 12  3  0.2159889
# 13 11  1.2322373
# 14 12  1.6093587
# 15 13  0.4015506
# 16  3 -0.2729840

# also remove the 3 rows that follow every matching row
df[!(cyclic_or_shift(df$x == 3, 3)),]
#   x          y
# 1 1  0.3407997
# 2 2 -0.7033403
# 7 7 -0.5013782
Arun