
I have a zoo dataset that is indexed on time. Some consecutive rows contain the same data, and I want to remove only these repeated rows.

                       redpardiff        relpar
2012-07-05 10:19:38 -9.531491e-05  4.727280e-07
2012-07-05 10:19:41 -9.531491e-05  4.727280e-07
2012-07-05 10:19:47 -9.531491e-05  4.727280e-07
2012-07-05 10:19:47 -9.531491e-05 -9.999995e-01
2012-07-05 10:19:47 -9.531491e-05  1.000000e+00
2012-07-05 10:19:49 -9.531491e-05 -9.999995e-01

After removal, it should look like this:

                       redpardiff        relpar
2012-07-05 10:19:38 -9.531491e-05  4.727280e-07
2012-07-05 10:19:47 -9.531491e-05 -9.999995e-01
2012-07-05 10:19:47 -9.531491e-05  1.000000e+00
2012-07-05 10:19:49 -9.531491e-05 -9.999995e-01

Comparing the rows sequentially is very slow. Is there a better way to do this? The dput() output of the sample data is:

x <- structure(c(-9.53149088309146e-05, -9.53149088309146e-05, -9.53149088309146e-05, 
-9.53149088309146e-05, -9.53149088309146e-05, -9.53149088309146e-05, 
4.72727990086241e-07, 4.72727990086241e-07, 4.72727990086241e-07, 
-0.99999952727201, 1.00000047272799, -0.99999952727201), .Dim = c(6L, 
2L), .Dimnames = list(NULL, c("redpardiff", "relpar")), index = structure(c(1341463778.55163, 
1341463781.40801, 1341463787.2642, 1341463787.52668, 1341463787.78777, 
1341463789.36693), class = c("POSIXct", "POSIXt")), class = "zoo")


shoonya

1 Answer

x[c(1, which(rowSums(abs(diff(x))) != 0) + 1), ]
                       redpardiff        relpar
2012-07-05 05:49:38 -9.531491e-05  4.727280e-07
2012-07-05 05:49:47 -9.531491e-05 -9.999995e-01
2012-07-05 05:49:47 -9.531491e-05  1.000000e+00
2012-07-05 05:49:49 -9.531491e-05 -9.999995e-01
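
For readers unpacking the one-liner, here is a minimal sketch of the same idea split into steps. It assumes the zoo package is installed and that the dput() output from the question is assigned to x. The names d, changed, keep, and result are illustrative only and are not part of the original answer.

```r
library(zoo)

# Sample data from the question, assigned to x
x <- structure(c(-9.53149088309146e-05, -9.53149088309146e-05, -9.53149088309146e-05,
-9.53149088309146e-05, -9.53149088309146e-05, -9.53149088309146e-05,
4.72727990086241e-07, 4.72727990086241e-07, 4.72727990086241e-07,
-0.99999952727201, 1.00000047272799, -0.99999952727201), .Dim = c(6L,
2L), .Dimnames = list(NULL, c("redpardiff", "relpar")), index = structure(c(1341463778.55163,
1341463781.40801, 1341463787.2642, 1341463787.52668, 1341463787.78777,
1341463789.36693), class = c("POSIXct", "POSIXt")), class = "zoo")

# Row-to-row differences; row i of d compares row i + 1 of x with row i
d <- diff(x)

# A row repeats its predecessor only if every column changed by exactly 0
changed <- rowSums(abs(coredata(d))) != 0

# Always keep the first row, plus each row that differs from the one before it
keep <- c(1, which(changed) + 1)   # rows 1, 4, 5, 6 for the sample data

result <- x[keep, ]
print(result)
```

This is vectorised, so it avoids an explicit row-by-row loop. Note that it compares values exactly, so rows that differ only by floating-point noise count as different, and rows containing NA would probably need separate handling. It also removes only consecutive repeats, which matches the expected output in the question.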
James