Delete rows with equal and consecutive values

Question

Consider this table

var1  var2         var3
565   P0049129/21  146
565   P0020151/04  146

I would like to go over this table, find consecutive rows where var3 has the same value (146 in this example), and remove all but one of those rows.

Note that in this table there are other rows where var3=146 and I want to keep those rows. I just want to remove duplication when var3 has the same value on two consecutive rows.

Thanks

score 2 · Answer 1 · answered Sep 05 '19 at 14:33

We can use rleid and then duplicated.

df[!duplicated(data.table::rleid(df$var3)), ]

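rleid assigns the same id to every row in a run of consecutive equal values, so this keeps only the first row of each run and drops the rest. Rows with the same var3 that are not adjacent fall into different runs and are kept.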
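As a rough illustration of the point above, here is a small sketch on made-up data (the `ex` data frame and its `var2` labels are hypothetical, not from the question) in which 146 appears in two separate runs. It assumes the data.table package is installed.

```r
library(data.table)  # provides rleid()

# Hypothetical data: var3 = 146 appears in two separate runs
ex <- data.frame(
  var1 = c(565L, 565L, 565L, 565L),
  var2 = c("A", "B", "C", "D"),   # placeholder labels, not from the question
  var3 = c(146L, 146L, 200L, 146L)
)

# A new id starts each time var3 changes from the previous row
run_id <- rleid(ex$var3)

# Keep only the first row of each run
res <- ex[!duplicated(run_id), ]
res
```

Only the second row (the repeat inside the first run of 146) should be dropped; the later 146 row belongs to its own run and stays.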

Ronak Shah

akrun · Answer 2 · 2019-09-05T14:41:52.480

We can use rleid to identify the runs of consecutive equal values and take the first row of each run

library(data.table)
# .I[1] is the row index of the first row in each run of var3
i1 <- setDT(df)[, .I[1], by = rleid(var3)]$V1
df[i1]
#    var1        var2 var3
# 1:  565 P0049129/21  146

Another option uses dplyr

library(dplyr)
df %>%
   # start a new group each time var3 differs from the previous row
   group_by(grp = cumsum(var3 != lag(var3, default = first(var3)))) %>%
   slice(1) %>%
   ungroup() %>%
   select(-grp)
# A tibble: 1 x 3
#   var1 var2         var3
#  <int> <chr>       <int>
#1   565 P0049129/21   146

Or we can do this in base R

grp <- with(rle(df$var3), rep(seq_along(values), lengths))
subset(df, !duplicated(grp))
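To make the base R version easier to follow, here is a minimal sketch of the intermediate values on a made-up vector (`x` is hypothetical, chosen so that 146 occurs in two separate runs).

```r
# Hypothetical vector with 146 in two separate runs
x <- c(146L, 146L, 200L, 146L)

# rle() returns one value and one length per run
r <- rle(x)
r$values
r$lengths

# Repeat each run's index by its length to label every element with its run
grp <- rep(seq_along(r$values), r$lengths)

# The first element of each run is the one that is not a duplicate of its label
keep <- !duplicated(grp)
x[keep]
```

The same `grp` vector is what the `with(rle(...), ...)` call builds, so `subset(df, !duplicated(grp))` keeps one row per run.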

data

df <- structure(list(var1 = c(565L, 565L), var2 = c("P0049129/21", 
"P0020151/04"), var3 = c(146L, 146L)), class = "data.frame", row.names = c(NA, 
-2L))

score 1 · Accepted Answer · answered Sep 05 '19 at 14:33

library(data.table)
setDT(df)

# rowid() numbers the rows within each run of var3; keep row 1 of every run
df[rowid(rleid(var3)) == 1]

IceCreamToucan
