    a      b     c             d
   5015  3.49 1059.500       0.00
   5023  2.50 6056.000       2.50
   5024  3.00 1954.500       3.00
   5026  3.49 1163.833       0.00
   5037  2.50 6797.000       2.50
   5038  3.00 2109.000       3.00
   5040  2.50 4521.000       2.50
   5041  3.33 2469.000       3.33

I want to repeat the most recently observed row where d is zero before each row with a non-zero value of d. The result should alternate between a row with d equal to zero and a row with a non-zero d, and each zero row must be the one most recently seen in the original data.

Output I want is:

   a     b    c              d      

  5015  3.49 1059.500       0.00    
  5023  2.50 6056.000       2.50    
  5015  3.49 1059.500       0.00    
  5024  3.00 1954.500       3.00    
  5026  3.49 1163.833       0.00    
  5037  2.50 6797.000       2.50    
  5026  3.49 1163.833       0.00    
  5038  3.00 2109.000       3.00    
  5026  3.49 1163.833       0.00    
  5040  2.50 4521.000       2.50    
  5026  3.49 1163.833       0.00    
  5041  3.33 2469.000       3.33    
Frank
sayali

2 Answers

We can create a custom function f that interleaves the first row of a group with each of the group's remaining rows. Split the data frame on cumsum(df1$d == 0), which starts a new group at every row where d equals 0, apply f to each group, and combine the results with do.call(rbind, ...). I added an optional `row.names<-`(..., NULL) call to reset the row names that rbind generates by default:

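# df1 is the data frame from the question.
# f repeats row 1 (the d == 0 row) before each of rows 2..nrow(x);
# it assumes every group has at least one non-zero row after its zero row.
f <- function(x) x[c(rbind(rep(1, nrow(x) - 1), 2:nrow(x))), ]
`row.names<-`(do.call(rbind, lapply(split(df1, cumsum(df1$d == 0)), f)), NULL)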
#       a    b        c    d
# 1  5015 3.49 1059.500 0.00
# 2  5023 2.50 6056.000 2.50
# 3  5015 3.49 1059.500 0.00
# 4  5024 3.00 1954.500 3.00
# 5  5026 3.49 1163.833 0.00
# 6  5037 2.50 6797.000 2.50
# 7  5026 3.49 1163.833 0.00
# 8  5038 3.00 2109.000 3.00
# 9  5026 3.49 1163.833 0.00
# 10 5040 2.50 4521.000 2.50
# 11 5026 3.49 1163.833 0.00
# 12 5041 3.33 2469.000 3.33
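
The grouping step can be looked at in isolation. This is a minimal sketch that uses only the d values from the question as a plain vector; the names d, grp and chunks are chosen for illustration and are not part of the answer's code.

```r
# Values of column d from the question's data.
d <- c(0, 2.5, 3, 0, 2.5, 3, 2.5, 3.33)

# d == 0 is TRUE at each zero row; cumsum() turns those markers into a
# running group id that increases by one at every zero row.
grp <- cumsum(d == 0)
grp
# 1 1 1 2 2 2 2 2

# split() then gives one chunk per zero row, each starting with that zero.
chunks <- split(d, grp)
```

Each chunk begins with the zero row that has to be repeated, which is why f only ever needs to repeat index 1.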

There is an interleave trick in there. Try c(rbind(c(1,1,1), c(2,3,4))) to see how the numbers are woven together.
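
Spelled out step by step, the interleave looks roughly like this. The variable names first, others, m and idx are illustrative only.

```r
first  <- c(1, 1, 1)  # index of the zero row, repeated once per non-zero row
others <- c(2, 3, 4)  # indices of the non-zero rows in the group

# rbind() stacks the two vectors into a 2 x 3 matrix ...
m <- rbind(first, others)

# ... and c() flattens it column by column, which alternates the two rows.
idx <- c(m)
idx
# 1 2 1 3 1 4
```

Using idx as a row index on a four-row group returns row 1 before each of rows 2, 3 and 4.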

Pierre L
  • Here are some other interleaving tricks, for folks' reference: http://stackoverflow.com/q/16443260/1191259 – Frank Mar 16 '16 at 17:01

Grouping with by in the data.table package is useful here:

library(data.table)
DF <- fread("    a      b     c             d
   5015  3.49 1059.500       0.00
   5023  2.50 6056.000       2.50
   5024  3.00 1954.500       3.00
   5026  3.49 1163.833       0.00
   5037  2.50 6797.000       2.50
   5038  3.00 2109.000       3.00
   5040  2.50 4521.000       2.50
   5041  3.33 2469.000       3.33")

DF[ # subset DF with the row indices computed below
  DF[, {
    ind <- .I[rep(1L, (.N - 1) * 2)] # repeat the group's first row index (its d = 0 row)
    ind[c(FALSE, TRUE)] <- .I[-1]    # overwrite every second position with the remaining indices
    ind
  }, by = cumsum(abs(d) < .Machine$double.eps^0.5)][["V1"]] # one group per d = 0 row; the tolerance
                                                            # guards against floating-point error if d was calculated
]

#        a    b        c    d
#  1: 5015 3.49 1059.500 0.00
#  2: 5023 2.50 6056.000 2.50
#  3: 5015 3.49 1059.500 0.00
#  4: 5024 3.00 1954.500 3.00
#  5: 5026 3.49 1163.833 0.00
#  6: 5037 2.50 6797.000 2.50
#  7: 5026 3.49 1163.833 0.00
#  8: 5038 3.00 2109.000 3.00
#  9: 5026 3.49 1163.833 0.00
# 10: 5040 2.50 4521.000 2.50
# 11: 5026 3.49 1163.833 0.00
# 12: 5041 3.33 2469.000 3.33
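
The two ideas inside the grouped expression can be tried in base R without data.table. This is a sketch under assumptions: ids stands in for .I within the second group (rows 4 to 8), and d_calc is a made-up calculated value, not something from the question's data.

```r
# Stand-in for .I in the second group: row 4 has d = 0, rows 5..8 do not.
ids <- 4:8

# Repeat the first index twice per non-zero row.
ind <- rep(ids[1], (length(ids) - 1) * 2)

# The logical index c(FALSE, TRUE) is recycled, so it selects positions
# 2, 4, 6, 8, which are overwritten with the remaining row numbers.
ind[c(FALSE, TRUE)] <- ids[-1]
ind
# 4 5 4 6 4 7 4 8

# Why the tolerance matters: a calculated "zero" may not be exactly 0.
d_calc <- 0.3 - 0.1 - 0.2
d_calc == 0                            # FALSE
abs(d_calc) < .Machine$double.eps^0.5  # TRUE
```

If d is read straight from a file, as in the question, d == 0 is enough; the tolerance only matters when d is the result of arithmetic.
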
Roland