1

I have a set of measurements in a data frame which I need to add in n number of rows of "NA" at every nth row. The reason is that I have a set of measurements that were taken only at every nth period of time and I will like to plot it to reflect the actual total time.

Hope there is some expert who can help me with this? I have been trying to figure this out for a very long time. Thank you.

For example, for an initial data frame below.

        x     y
 row1   1     2
 row2   1     2
 row3   1     2     
 row4   1     2
 row5   1     2
 row6   1     2
 row7   1     2
 row8   1     2
 row9   1     2
 row10  1     2

Suppose I want to add 2 "NA" rows at every other 2 rows:

        x     y
 row1   1     2
 row2   1     2
 row3   NA    NA
 row4   NA    NA
 row5   1     2
 row6   1     2
 row7   NA    NA
 row8   NA    NA
 row9   1     2
 row10  1     2 
 row11   NA    NA
 row12   NA    NA
 row13   1     2
 row14   1     2 
 row15   NA    NA
 row16   NA    NA
 row17   1     2
 row18   1     2 
user438383
  • 5,716
  • 8
  • 28
  • 43
Lozenges
  • 11
  • 1
  • 1
    Use your 'time' variable to complete the sequence. Have a look at [this](https://stackoverflow.com/questions/16787038/insert-rows-for-missing-dates-times) – Sotos Apr 08 '22 at 09:59
  • Possible duplicate: https://stackoverflow.com/questions/43403282/add-row-in-each-group-using-dplyr-and-add-row – Martin Gal Apr 08 '22 at 10:14

2 Answers2

2

You could use a custom grouping together with group_modify:

library(dplyr)

df %>% 
  group_by(grp = (row_number() + 1) %/% 2 ) %>% 
  group_modify(~ add_row(.x, x = rep(NA, 2))) %>% 
  ungroup() %>% 
  select(-grp)

This returns

# A tibble: 20 x 2
       x     y
   <dbl> <dbl>
 1     1     2
 2     1     2
 3    NA    NA
 4    NA    NA
 5     1     2
 6     1     2
 7    NA    NA
 8    NA    NA
 9     1     2
10     1     2
11    NA    NA
12    NA    NA
13     1     2
14     1     2
15    NA    NA
16    NA    NA
17     1     2
18     1     2
19    NA    NA
20    NA    NA
Martin Gal
  • 16,640
  • 5
  • 21
  • 39
1

You could split at the cumsum of row indeces modulo 2 using by and rbind a matrix with the desired dimensions and same dimnames.

by(dat, cumsum(seq_len(nrow(dat)) %% 2), \(x) 
   rbind(x, matrix(,2, 2, dimnames=list(NULL, names(x))))) |>
  do.call(what=rbind) |>
  {\(.) `rownames<-`(., paste0('row', seq_len(nrow(.))))}()  ## optional
#        x  y
# row1   1  2
# row2   1  2
# row3  NA NA
# row4  NA NA
# row5   1  2
# row6   1  2
# row7  NA NA
# row8  NA NA
# row9   1  2
# row10  1  2
# row11 NA NA
# row12 NA NA
# row13  1  2
# row14  1  2
# row15 NA NA
# row16 NA NA
# row17  1  2
# row18  1  2
# row19 NA NA
# row20 NA NA

Note: R >= 4.1 used.


Data:

dat <- structure(list(x = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), 
    y = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L)), class = "data.frame", row.names = c("row1", 
"row2", "row3", "row4", "row5", "row6", "row7", "row8", "row9", 
"row10"))
jay.sf
  • 60,139
  • 8
  • 53
  • 110