I am stuck on how to sum consecutive duplicate odd rows and remove all but the first row of each run. I already know how to sum consecutive duplicate rows and remove all but the first row (see https://stackoverflow.com/a/32588960/11323232). In this project, however, I want to sum only the consecutive duplicate rows whose ia is odd, not all consecutive duplicate rows.

 ia<-c(1,1,2,NA,2,1,1,1,1,2,1,2)
 time<-c(4.5,2.4,3.6,1.5,1.2,4.9,6.4,4.4, 4.7, 7.3,2.3, 4.3)
 a<-as.data.frame(cbind(ia, time))

  a
   ia time
1   1  4.5
2   1  2.4
3   2  3.6
5   2  1.2
6   1  4.9
7   1  6.4
8   1  4.4
9   1  4.7
10  2  7.3
11  1  2.3
12  2  4.3

to 

 a
   ia time
1   1  6.9
3   2  3.6
5   2  1.2
6   1  20.4
10  2  7.3
11  1  2.3
12  2  4.3

How can I edit the following code so that it sums only the consecutive duplicate odd rows and keeps just the first row of each run?

 library(dplyr)  # %>%, filter, mutate, lag, group_by, summarise
 library(zoo)    # na.locf

 result <- a %>%
 filter(na.locf(ia) == na.locf(ia, fromLast = TRUE)) %>%
 mutate(ia = na.locf(ia)) %>%
 mutate(change = ia != lag(ia, default = FALSE)) %>%
 group_by(group = cumsum(change), ia) %>%
 # this part
 summarise(time = sum(time))
hees
    Thanks tmfmnk and Patrik_P. I have another problem: what if 'time' is a list of the same length, such as time <- list(c(4.5,2), 2.4,3.6,1.5,1.2,4.9,6.4,4.4, 4.7, 7.3,2.3, 4.3)? How can I do it then? – hees Apr 08 '19 at 09:28

2 Answers

One dplyr possibility could be:

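library(dplyr)

a %>%
 # run id: increments whenever ia changes (an NA forms its own run)
 group_by(grp = with(rle(ia), rep(seq_along(lengths), lengths))) %>%
 mutate(grp2 = ia %% 2 == 1,                       # TRUE for odd ia
        time = if_else(grp2, sum(time), time)) %>% # sum only the odd runs
 # keep every even row but only the first row of each odd run; NA rows drop out
 filter(!grp2 | (grp2 & row_number() == 1)) %>%
 ungroup() %>%
 select(-grp, -grp2)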

      ia  time
  <dbl> <dbl>
1     1   6.9
2     2   3.6
3     2   1.2
4     1  20.4
5     2   7.3
6     1   2.3
7     2   4.3
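
For comparison, the same run-based idea can be sketched in base R without any packages. This is only a sketch, assuming the sample data from the question and that rows with a missing ia should be dropped, as in the output above; `odd`, `run_sum`, `first_row`, `keep` and `result` are names introduced here for illustration.

```r
# Sample data from the question
ia   <- c(1, 1, 2, NA, 2, 1, 1, 1, 1, 2, 1, 2)
time <- c(4.5, 2.4, 3.6, 1.5, 1.2, 4.9, 6.4, 4.4, 4.7, 7.3, 2.3, 4.3)
a    <- data.frame(ia, time)

# Run id: rle() starts a new run whenever ia changes (each NA is its own run)
runs <- rle(a$ia)
grp  <- rep(seq_along(runs$lengths), runs$lengths)

odd       <- a$ia %% 2 == 1               # TRUE for odd ia, NA for missing ia
run_sum   <- ave(a$time, grp, FUN = sum)  # per-run total, repeated on each row
first_row <- !duplicated(grp)             # TRUE on the first row of every run

# Odd runs take the run total; even rows keep their own time
a$time <- ifelse(odd %in% TRUE, run_sum, a$time)

# Keep all even rows and the first row of each odd run; drop rows with NA ia
keep   <- !is.na(odd) & (!odd | first_row)
result <- a[keep, ]
result
```

Because the subsetting keeps the original row names, `result` is labelled 1, 3, 5, 6, 10, 11, 12, which makes it easy to compare with the desired output in the question.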
tmfmnk

With data.table you could try the following:

library(data.table)
ia <- c(1,1,2,NA,2,1,1,1,1,2,1,2)
time <- c(4.5,2.4,3.6,1.5,1.2,4.9,6.4,4.4, 4.7, 7.3,2.3, 4.3)
a <- data.table(ia, time)
a[, sum(time), by=.(ia, rleid(!ia %% 2 == 0))]

Gives

##   ia rleid   V1
##1:  1     1  6.9
##2:  2     2  3.6
##3: NA     3  1.5
##4:  2     4  1.2
##5:  1     5 20.4
##6:  2     6  7.3
##7:  1     7  2.3
##8:  2     8  4.3
Patrik_P
  • Great, thanks Patrik_P! What if 'time' is a list of the same length, such as time <- list(c(4.5,2), 2.4,3.6,1.5,1.2,4.9,6.4,4.4, 4.7, 7.3,2.3, 4.3)? How can I do it then? – hees Apr 08 '19 at 09:24
  • If you assign a list as a column in a data.table, you could do `time2 <- list(c(4.5,2), 2.4,3.6,1.5,1.2,4.9,6.4,4.4, 4.7, 7.3,2.3, 4.3); a2 <- data.table(ia, time2); a2[, .(SUMtime = sum(unlist(time2))), by=.(ia, rleid(!ia %% 2 == 0))]` – Patrik_P Apr 08 '19 at 10:17
  • Thanks again. My comment may have been misleading. Could you please look at the following question? I have posted a new question that describes the updated request: https://stackoverflow.com/q/55571484/11323232 – hees Apr 08 '19 at 10:48
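
The list-column case from the comments can also be sketched in base R. This is a rough sketch rather than the data.table code itself: it groups only by runs of odd/even ia, which matches the data.table grouping for the sample data, where ia is 1, 2 or NA, but may differ for other values. The names `grp`, `run_total` and `res` are introduced here for illustration.

```r
ia    <- c(1, 1, 2, NA, 2, 1, 1, 1, 1, 2, 1, 2)
time2 <- list(c(4.5, 2), 2.4, 3.6, 1.5, 1.2, 4.9, 6.4, 4.4, 4.7, 7.3, 2.3, 4.3)

# Run id over oddness: a new run starts whenever odd/even flips (NA is its own run)
odd  <- ia %% 2 == 1
runs <- rle(odd)
grp  <- rep(seq_along(runs$lengths), runs$lengths)

# Flatten every list element that falls in a run, then add the values up
run_total <- vapply(split(time2, grp), function(x) sum(unlist(x)), numeric(1))

# One row per run: the ia of its first row and the run total
res <- data.frame(ia = ia[!duplicated(grp)], SUMtime = unname(run_total))
res
```

As in the data.table answer, the NA row is kept as its own group here, and the first run now includes both values of the vector-valued first element.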