I would like to collapse consecutive duplicate values while keeping the row order.
Input
x 1 2 3 3 2 2 3 1 1
to
x 1 2 3 2 3 1
What would be a suitable function for such an operation?
Thank you
One option is to compute the vector of differences with diff(), prepend a non-zero number to it (the first value has no predecessor, so it cannot produce a difference), and use the result to keep only the values of x where the difference is != 0:
x <- c( 1, 2, 3, 3, 2, 2, 3, 1, 1)
x[c(1,diff(x)) != 0]
[1] 1 2 3 2 3 1
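The diff() approach above relies on x being numeric. As a rough sketch of an alternative that should also work for character vectors, base R's rle() computes runs of equal consecutive values, and its values component holds one entry per run. The helper name collapse_runs is made up here for illustration.

```r
# Hypothetical helper: keep one value per run of consecutive duplicates
collapse_runs <- function(v) rle(v)$values

x <- c(1, 2, 3, 3, 2, 2, 3, 1, 1)
collapse_runs(x)

# Character input, which diff() cannot handle
collapse_runs(c("a", "a", "b", "a"))
```

Note that rle() expects a plain atomic vector, so a factor would need to be converted first (for example with as.character()).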
Since you mentioned data.frame, here is one way to solve this within the tidyverse, assuming x is a column of the data frame. We can use the lag() function to get the value from the preceding row:
library(dplyr)
data.frame(x) %>%
# keep rows where x differs from the previous row, or where there is no previous row (lag(x) is NA for the first row)
dplyr::filter(lag(x)-x != 0 | is.na(lag(x)))
x
1 1
2 2
3 3
4 2
5 3
6 1
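If the data frame has other columns that should be kept alongside x, the same idea can be sketched in base R without dplyr. The id column below is an assumption added only to show that whole rows are kept, and the sketch assumes the data frame has at least one row.

```r
# Example data; the id column is hypothetical and only there for illustration
df <- data.frame(x = c(1, 2, 3, 3, 2, 2, 3, 1, 1), id = 1:9)

# TRUE for the first row and for every row whose x differs from the previous row
keep <- c(TRUE, df$x[-1] != df$x[-nrow(df)])

df[keep, ]
```

The original row names are preserved in the result, so they show which rows of the input survived.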