Fill Blank Rows in Dataframe with Sums for each section of data R

Question

I have data that I separated within a dataframe by item description and type. The separations are blank rows but I would like to fill the blank rows in with the sums of numeric values by each description and, if possible, add another blank row below the sums. Preferably, I would not need to sum the sections of data that only contain one row - see variable desc "a" but not a big deal if I do get a sum there.

This is an example of what I have now:

   desc   type  xvalue yvalue
1      a     z     16      1
2
3      b     y     17      2
4      b     y     18      3
5
6      c     x     19      4
7      c     x     20      5
8      c     x     21      6
9
10     d     x    22      7
11     d     x    23      8
12
13     d     y    24      9
14     d     y    25     10

What I am looking for is output that looks similar to this.

  desc   type  xvalue yvalue
1      a     z     16      1
2
3      b     y     17      2
4      b     y     18      3
5                  35      5 
6
7      c     x     19      4
8      c     x     20      5
9      c     x     21      6
10                 40      15
11
12     d     x    22      7
13     d     x    23      8
14                45      15
15
16     d     y    24      9
17     d     y    25     10
18                49     19

I found an answer on how to do this in a column but not a row. Adding column of summed data by group with empty rows with R

I used acylam's dplyr answer to this question Add blank rows in between existing rows to create the empty rows. I changed the code slightly to fit my data better so my code is:

library(dplyr)
df %>%
  split(df$id, df$group) %>%
  Map(rbind, ., "") %>%
  do.call(rbind, .)

I am hoping I can just add options to the do.call(rbind...) dplyr code I have above.

Do you need this in a data frame or can this be in a table like one produced in GT? — pluke, May 24 '22 at 19:05
@pluke I would prefer either a data frame or matrix if possible. Thanks — zero351, May 24 '22 at 19:19
This is reasonable for a table, but an anti-pattern with actual data (not necessarily a good/normal method) because columns df's are vectors, usually atomic vectors, which can only contain data of one type. While `desc` and `type` can store `""` (empty strings), `xvalue` and `yvalue` are/should be integer type vectors. They can store `NA` or `0`, but not empty strings or empty cells (like one might in Excel). This is all to say, adding context and saying what you are trying to accomplish with the empty rows would help, because there is likely an easier/more structured solution. — bash1000, May 24 '22 at 19:27
@bash I agree with you that having empty rows with actual data is not great. The goal is to have it spaced out/separated for export to excel. — zero351, May 24 '22 at 19:44

score 1 · Accepted Answer · answered May 24 '22 at 19:27

Depending on how your data is organized we could do it this way:

Assuming empty rows are NA's (if not for example they are blank we can make them NA)

we use group_split() after grouping, getting a list, then iterate with map_df over the list using janitor's adorn_totals


library(dplyr)
library(janitor)
df %>% 
  na.omit() %>% # maybe you don't need this line
  group_by(desc, type) %>% 
  group_split() %>% 
  purrr::map_df(., janitor::adorn_totals)

 desc type xvalue yvalue
     a    z     16      1
 Total    -     16      1
     b    y     17      2
     b    y     18      3
 Total    -     35      5
     c    x     19      4
     c    x     20      5
     c    x     21      6
 Total    -     60     15
     d    x     22      7
     d    x     23      8
 Total    -     45     15
     d    y     24      9
     d    y     25     10
 Total    -     49     19

data:

structure(list(desc = c("a", NA, "b", "b", NA, "c", "c", "c", 
NA, "d", "d", NA, "d", "d"), type = c("z", NA, "y", "y", NA, 
"x", "x", "x", NA, "x", "x", NA, "y", "y"), xvalue = c(16L, NA, 
17L, 18L, NA, 19L, 20L, 21L, NA, 22L, 23L, NA, 24L, 25L), yvalue = c(1L, 
NA, 2L, 3L, NA, 4L, 5L, 6L, NA, 7L, 8L, NA, 9L, 10L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14"))

score 0 · Answer 2 · answered May 24 '22 at 20:11

Here's a full answer adding empty rows and removing janitor's added stuff from @TarJae's answer:

library(dplyr)
library(janitor)

df <- df %>% 
    na.omit() %>% # maybe you don't need this line
    group_by(desc, type) %>% 
    group_split() %>% 
    purrr::map_df(., \(x) {x <- x %>% janitor::adorn_totals() %>% rbind(NA)}) %>% 
    mutate(
        desc = ifelse(desc == "Total", NA, desc),
        type = ifelse(type == "-", NA, type)
    )

Fill Blank Rows in Dataframe with Sums for each section of data R

2 Answers2