0

I have a question regarding to addition of rows from different tables having same column names. I have time series of two tables with values 8760 rows (whole year).

Table1

Name    Year    Month   Day Hour    Value
Plant_1 2020    1   1   1   10
Plant_2 2020    1   1   1   20
Plant_3 2020    1   1   1   30
Plant_1 2020    1   1   2   40
Plant_2 2020    1   1   2   50
Plant_3 2020    1   1   2   60

Table2

Name    Year    Month   Day Hour    Value
Plant_x 2020    1   1   1   1
Plant_y 2020    1   1   1   2
Plant_z 2020    1   1   1   3
Plant_x 2020    1   1   2   4
Plant_y 2020    1   1   2   5
Plant_z 2020    1   1   2   6

What I want is, summation of value of all plants at same time period like

Year    Month   Day Hour    Value
2020    1   1   1   66
2020    1   1   2   165

I don't care about name of plant but need to get sum of total value at each hour of the year. I was trying to do something like this but doesn't work for tables more than two and I have 9 to 10 such tables. Could anyone help me to improve this code or any other function which I can use?

SumOfValue <- Table1%>% 
                full_join(Table2) %>% 
                group_by (Year,Month,Day,Hour) %>% 
                summarise(Value=sum(Value))

Any help would be appreciated. Thank you.

Sotos
  • 51,121
  • 6
  • 32
  • 66
rjt
  • 39
  • 5

1 Answers1

1

It looks like your two dataframes have the same exact format, so you can just rbind them and then get the summary per Year, Month, Day and Hour.

df = rbind(a,b)%>%group_by(Year,Month,Day,Hour)%>%summarise(Value=sum(Value))

# Alternative as suggested by Sotos
bind_rows(a, b) %>%group_by(Year,Month,Day,Hour)%>%summarise(Value=sum(Value))

# A tibble: 2 x 5
# Groups:   Year, Month, Day [?]
   Year Month   Day  Hour Value
  <int> <int> <int> <int> <int>
1  2020     1     1     1    66
2  2020     1     1     2   165

Data

a = structure(list(Name = structure(c(1L, 2L, 3L, 1L, 2L, 3L), .Label = c("Plant_1", 
"Plant_2", "Plant_3"), class = "factor"), Year = c(2020L, 2020L, 
2020L, 2020L, 2020L, 2020L), Month = c(1L, 1L, 1L, 1L, 1L, 1L
), Day = c(1L, 1L, 1L, 1L, 1L, 1L), Hour = c(1L, 1L, 1L, 2L, 
2L, 2L), Value = c(10L, 20L, 30L, 40L, 50L, 60L)), class = "data.frame", row.names = c(NA, 
-6L))

b = structure(list(Name = structure(c(1L, 2L, 3L, 1L, 2L, 3L), .Label = c("Plant_x", 
"Plant_y", "Plant_z"), class = "factor"), Year = c(2020L, 2020L, 
2020L, 2020L, 2020L, 2020L), Month = c(1L, 1L, 1L, 1L, 1L, 1L
), Day = c(1L, 1L, 1L, 1L, 1L, 1L), Hour = c(1L, 1L, 1L, 2L, 
2L, 2L), Value = 1:6), class = "data.frame", row.names = c(NA, 
-6L))
boski
  • 2,437
  • 1
  • 14
  • 30
  • 1
    Might as well keep it all in the 'verse, i.e. `bind_rows(a, b) %>% group_by(...) ...` – Sotos May 09 '19 at 13:55
  • @boski, `df = rbind(a,b) df%>%group_by(Year,Month,Day,Hour)%>%summarise(Value=sum(Value))` this works fine only if I do not use `df ` before group_by means if I use `df = rbind(a,b) %>%group_by(Year,Month,Day,Hour)%>%summarise(Value=sum(Value))` then it works fine. Could you please edit your answer so that I can accept it. Thanks for your quick reply. – rjt May 09 '19 at 13:59
  • done.. but how come it is not working ? it works for me. If you mean its not saving the results it is because I didn't assign it to `df` so it would display the answer. Cheers! – boski May 09 '19 at 14:01
  • @boski, I don't know why its not working but this was an error `Error: unexpected symbol in "Table <- rbind(Table1, Table2) Table"` So I used without that `Table` before `group_by` and it works fine! Thanks a lot. – rjt May 09 '19 at 14:11
  • @rjt no worries. Your problem is that last `Table` you have there ... You should do `Table <- rbind(Table1,Table2)` and then in a new line or after a `;` , `Table <- Table%>%group_by( ...)`. Keep it in mind for the next time! – boski May 09 '19 at 14:19
  • @boski Ohh! Thanks, I'm very new to r environment so couldn't guess it. – rjt May 09 '19 at 14:22