I have a data.frame of 176800 observations. I want to remove the rows that fulfil the following condition:

full_data_string_split$split1=="SP#" & full_data_string_split$split2=="11"

I know this condition is fulfilled in 425 cases, and the following removes exactly those 425 rows:

full_data_string_split_removed1 <- full_data_string_split[!(full_data_string_split$split1 == "SP#" & full_data_string_split$split2 == "11"), ]

My question now is: how do I delete each row that meets the condition together with the 207 rows that follow it, so that my data.frame is reduced to 88400 observations?

JasonAizkalns
Johannes
  • Possible duplicate of [Filtering a data.frame](https://stackoverflow.com/questions/1686569/filtering-a-data-frame) – Maurits Evers Oct 26 '17 at 01:11
  • @MauritsEvers No, this is not quite what I want. I don't think my question duplicates the one you suggested. – Johannes Oct 26 '17 at 01:17
  • Without any sample data and code it's difficult to infer what you want. So you want to remove rows that fulfil a certain condition *plus the subsequent 207 rows*? In that case you can get the index of the last condition-matching row with `max(which(...))`, and then create a new range of indices by adding 207 to that last matching row position. – Maurits Evers Oct 26 '17 at 01:23
  • @MauritsEvers agree that more sample code is required, but based on starting with 176,800 observations and the desired result is 88,400, we can infer that it's the subsequent 207 rows. 176,800 - (208*425) = 88,400. – JasonAizkalns Oct 26 '17 at 01:26
  • @JasonAizkalns Yes you're right. Clearly too lazy to do the math... +1 to your solution plus explanation. – Maurits Evers Oct 26 '17 at 01:31

2 Answers

This would be a lot easier to solve with a reproducible example, but I believe this should work:

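# Row numbers where the condition holds
indicies_with_match <- which(full_data_string_split$split1 == "SP#" & full_data_string_split$split2 == "11")
# Add the offsets 0..207 to every match; the vector of matches is recycled across the offsets
indicies_to_remove <- indicies_with_match + rep(0:207, each = length(indicies_with_match))

results <- full_data_string_split[-indicies_to_remove, ]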

Be sure to check out `?which`, and then consider `x <- c(1, 5, 9)` and what happens when you do `x + rep(0:2, each = length(x))`.
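To make the suggested experiment concrete, here is a minimal sketch of the index arithmetic on the toy vector from the text. The names `offsets` and `idx` are introduced only for this illustration; the point is that R recycles the shorter vector `x` across the repeated offsets.

```r
# Toy illustration of the index arithmetic (names below are for illustration only)
x <- c(1, 5, 9)                        # pretend these are the matching row numbers
offsets <- rep(0:2, each = length(x))  # 0 0 0 1 1 1 2 2 2
idx <- x + offsets                     # x is recycled: 1 5 9 1 5 9 1 5 9
sort(idx)
# [1]  1  2  3  5  6  7  9 10 11
```

Each match therefore expands into itself plus the two rows after it; with `0:207` the same pattern should yield each match plus the following 207 rows.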

JasonAizkalns
  • If I understand correctly, if you change `length(x)` to `length(indicies_with_match)`, it might work for OP. I think OP had an issue reproducing it just like I did. – CPak Oct 26 '17 at 01:43

Maybe this will work?

result <- which(airquality$Month == 5 & airquality$Temp == 67)
keepalso <- 3
keep <- Reduce("c", apply(cbind(result, result+keepalso), 1, function(x) c(x[1]:x[2])))
airquality[keep,]

#    Ozone Solar.R Wind Temp Month Day
# 1     41     190  7.4   67     5   1
# 2     36     118  8.0   72     5   2
# 3     12     149 12.6   74     5   3
# 4     18     313 11.5   62     5   4
# 28    23      13 12.0   67     5  28
# 29    45     252 14.9   81     5  29
# 30   115     223  5.7   79     5  30
# 31    37     279  7.4   76     5  31

Adapting to your case:

# Row numbers of the matches
result <- which(full_data_string_split$split1 == "SP#" & full_data_string_split$split2 == "11")
discardalso <- 207
# Expand each match into the range match:(match + 207) and concatenate the ranges
discard <- Reduce("c", apply(cbind(result, result + discardalso), 1, function(x) c(x[1]:x[2])))
full_data_string_split[-discard, ]
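Since the question has no sample data, the following is a rough sketch on a made-up data frame rather than the OP's real one. The `toy` data, `n_following`, `hits`, `to_drop` and `cleaned` are all hypothetical names, and `unlist(lapply(...))` is swapped in here as one possible way to build the ranges. It also guards against the case where nothing matches, because negative indexing with an empty vector returns zero rows instead of the full data.

```r
# Hypothetical toy data: 3 blocks of 4 rows, each block starting with an "SP#" row
toy <- data.frame(
  split1 = rep(c("SP#", "a", "b", "c"), times = 3),
  split2 = rep(c("11", "1", "2", "3"), times = 3),
  stringsAsFactors = FALSE
)
toy$split2[5] <- "12"  # the second block's first row no longer matches

hits <- which(toy$split1 == "SP#" & toy$split2 == "11")  # rows 1 and 9
n_following <- 3                                         # 207 in the question

# Each hit plus the n_following rows after it
to_drop <- unlist(lapply(hits, function(i) i:(i + n_following)))
to_drop <- to_drop[to_drop <= nrow(toy)]  # discard indices past the last row

# Guard: toy[-integer(0), ] would return zero rows, not the full data
cleaned <- if (length(to_drop) > 0) toy[-to_drop, , drop = FALSE] else toy
nrow(cleaned)
# [1] 4
```

On the real data, comparing `nrow()` of the result with the expected 88400 is a quick way to confirm that the blocks do not overlap or run past the end.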
CPak