1

I'm trying to add additional rows to my data table with the column totals so that when I display on ggplot, I am able to filter by "Total" for my selectInput in my Shiny app. However, because I have various data types (i.e. date, string and numeric), it makes it more complicated.

Here's a sample df:

data.frame(
  Date = rep(seq(as.Date("2018-01-01"), by= "1 day", length.out= 3), 3),
  Company = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  Attr_1 = c("AB", "AC", "AD", "AB", "AC", "AD", "AB", "AC", "AD"),
  Attr_2 = c(1,2,3,4,5,6,7,8,9)
)

Here's what I'm hoping to achieve:

Date       Company Attr_1 Attr_2
2018-01-01       A     AB      1
2018-01-02       A     AC      2
2018-01-03       A     AD      3
2018-01-01       B     AB      4
2018-01-02       B     AC      5
2018-01-03       B     AD      6
2018-01-01       C     AB      7
2018-01-02       C     AC      8
2018-01-03       C     AD      9
2018-01-01   Total     AB     12
2018-01-02   Total     AC     15
2018-01-03   Total     AD     18

Does anyone have an easy solution for this? What I can think of is to calculate the colSums manually and then rbind back into this dataframe. But is there a simpler solution?

ZPeh
  • 592
  • 1
  • 7
  • 18

4 Answers4

2
df = data.frame(
  Company = c("A", "B", "C", "D", "A", "B"),
  Attr_1 = c(12,13,14,14,3,5),
  Attr_2 = c(1,2,3,4,5,4)
)

library(dplyr)

bind_rows(df, df %>%
               summarise_at(vars(matches("Attr")), funs(sum)) %>%
               mutate(Company = "Total"))

#   Company Attr_1 Attr_2
# 1       A     12      1
# 2       B     13      2
# 3       C     14      3
# 4       D     14      4
# 5       A      3      5
# 6       B      5      4
# 7   Total     61     19

Solution to your edit:

df %>%
  group_by(Date, Attr_1) %>%
  summarise(Attr_2 = sum(Attr_2),
            Company = "Total") %>%
  ungroup() %>%
  bind_rows(df, .)
AntoniosK
  • 15,991
  • 2
  • 19
  • 32
  • Thanks @AntoniosK, but this solution doesn't work for my revised problem. May I know if you have a solution to the edited problem instead? – ZPeh Sep 26 '18 at 15:40
  • I've updated my answer... – AntoniosK Sep 26 '18 at 16:10
  • Thanks for your solution! I've tried it but unfortunately, the "Total" column is in the format of character, while my previous Company column comprises of factors and I got the following warning message: "binding factor and character vector, coercing into character vector". Even after adding factor("Total", levels="Total") into the summarise argument, I'm still getting the following warning: "In bind_rows_(x, .id) : Unequal factor levels: coercing to character" In the end, I decided to use rbind instead of bind_rows. Is there a reason why I rbind works but bind_rows doesn't? – ZPeh Sep 27 '18 at 02:04
  • Warning is not an error. The job is still done! But if you want to eliminate warnings try to use `character` variables instead of `factors` from the beginning. Can you do that, or you want me to update my answer? – AntoniosK Sep 27 '18 at 09:51
  • I'm able to do that myself, thanks a lot for the clarification! Appreciate it! – ZPeh Sep 29 '18 at 12:23
1

A solution that works even if there is a 'W' company.

data.frame(
  Company = c("A", "B", "W", "D", "A", "B"),
  Attr_1 = c(12,13,14,14,3,5),
  Attr_2 = c(1,2,3,4,5,4),
  stringsAsFactors=FALSE
) -> df

df %>% summarise_if(is.numeric,sum) %>%
  mutate(Company='Total') %>% 
  bind_rows(df,.)
#  Company Attr_1 Attr_2
#1       A     12      1
#2       B     13      2
#3       W     14      3
#4       D     14      4
#5       A      3      5
#6       B      5      4
#7   Total     61     19
Nicolas2
  • 2,170
  • 1
  • 6
  • 15
  • Thanks @Nicholas2, may I know if you have any solution to my updated problem? – ZPeh Sep 26 '18 at 15:41
  • It is just a matter of grouping on Date and Attr_1 and summarizing Attr_2 before row_binding. But before trying anything are you sure that there is only one value for Attr_1 for each Date? – Nicolas2 Sep 27 '18 at 07:46
  • There may be more than 1 value for Attr_1 for each date, but I manage to figure out a solution for it. Thanks for your help! – ZPeh Sep 29 '18 at 12:24
1

Here's a base R solution:

df <- data.frame(
  Company = c("A", "B", "C", "D", "A", "B"),
  Attr_1 = c(12,13,14,14,3,5),
  Attr_2 = c(1,2,3,4,5,4)
)
rbind(df, data.frame(Company = "Total", Attr_1 = sum(df$Attr_1), Attr_2 = sum(df$Attr_2)))

Output:

  Company Attr_1 Attr_2
1       A     12      1
2       B     13      2
3       C     14      3
4       D     14      4
5       A      3      5
6       B      5      4
7   Total     61     19
symbolrush
  • 7,123
  • 1
  • 39
  • 67
1

I find adorn_totals from the janitorpackage very useful for this (and other) tasks

library( janitor )

df %>% adorn_totals()

# Company Attr_1 Attr_2
#       A     12      1
#       B     13      2
#       C     14      3
#       D     14      4
#       A      3      5
#       B      5      4
#   Total     61     19
Wimpel
  • 26,031
  • 1
  • 20
  • 37