
I am trying to compute each record's fraction (%) of its group total in a data frame. My data look like:

[Image: screenshot of the data table]

I have factors for Station, Month and Phylum, and then a total. I would like to show the totals as relative percentages: that is, sum the totals by Station and Month, then divide each row of the original table by its group total.

In R, I got as far as:

bn_phyla %>% 
  group_by(Station, Month) %>% 
  summarise(total = sum(`SumOfTotal Caught`)) %>% 
  mutate(prop = `SumOfTotal Caught` / total)

This gets me the group totals, but how do I then divide them back into the original data and preserve the Phylum column?

Thanks

PS: Does Stack Overflow have no way of inserting a table other than as an image?

jay.sf
DarwinsBeard
  • What do you mean by "divide that back into the original data"? You can (and should) post what you want your final result to look like. Also, you can post some sample data the same way you posted your code, like `data = tibble(group = c("A", "B", "C", "D"), amount = c(1, 2, 3, 4))` – Ben G Jun 29 '18 at 14:03
  • Insert the output from `dput(yourTable)` into your question so we can reconstruct your data. Also read https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – jogo Jun 29 '18 at 14:11

1 Answer


You can do it without summarise: a grouped mutate keeps every row (and therefore the Phylum column) and evaluates sum() within each Station/Month group. I doubled your example data so that there are 2 groups to demonstrate with.

library(dplyr)

bn_phyla %>% 
  group_by(Station, Month) %>% 
  mutate(prop = SumOfTotal_Caught/sum(SumOfTotal_Caught))

# A tibble: 8 x 5
# Groups:   Station, Month [2]
  Station Month  Phylum     SumOfTotal_Caught  prop
  <chr>   <chr>  <chr>                  <dbl> <dbl>
1 A       Feb-18 Annelida                  20 0.182
2 A       Feb-18 Arthropoda                20 0.182
3 A       Feb-18 Mollusca                  30 0.273
4 A       Feb-18 Nemertea                  40 0.364
5 B       Mar-18 Annelida                  40 0.333
6 B       Mar-18 Arthropoda                30 0.25 
7 B       Mar-18 Mollusca                  30 0.25 
8 B       Mar-18 Nemertea                  20 0.167
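
Since the question asks for relative percentages, the same grouped mutate can be scaled by 100 and checked. The following is only a sketch on the toy data from this answer: the names `bn_pct`, `pct`, `pct_check` and `check` are made up for illustration, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Toy data, same values as the example data in this answer
bn_phyla <- tibble(
  Station = rep(c("A", "B"), each = 4),
  Month = rep(c("Feb-18", "Mar-18"), each = 4),
  Phylum = rep(c("Annelida", "Arthropoda", "Mollusca", "Nemertea"), times = 2),
  SumOfTotal_Caught = c(20, 20, 30, 40, 40, 30, 30, 20)
)

# Percentage of each row within its Station/Month group;
# ungroup() so later steps do not silently stay grouped
bn_pct <- bn_phyla %>%
  group_by(Station, Month) %>%
  mutate(pct = 100 * SumOfTotal_Caught / sum(SumOfTotal_Caught)) %>%
  ungroup()

# Sanity check: the percentages in each group should add up to 100
pct_check <- bn_pct %>%
  group_by(Station, Month) %>%
  summarise(check = sum(pct), .groups = "drop")

print(bn_pct)
print(pct_check)
```

Calling `ungroup()` at the end is optional, but it avoids surprises if the result is passed on to further `mutate()` or `summarise()` calls.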

data:

# tibble() is re-exported by dplyr (data_frame() is deprecated)
bn_phyla <- tibble(Station = c(rep("A", 4), rep("B", 4)),
                   Month = c(rep("Feb-18", 4), rep("Mar-18", 4)),
                   Phylum = c("Annelida", "Arthropoda", "Mollusca", "Nemertea", "Annelida", "Arthropoda", "Mollusca", "Nemertea"),
                   SumOfTotal_Caught = c(20,20,30,40, 40,30,30,20))
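
If you do want to keep the summarise step from the question, one possible way to "divide back" is to join the group totals onto the original rows. This is a sketch of that alternative rather than the approach used above; `group_totals` and `bn_joined` are names invented for the example, and the data are the same toy values.

```r
library(dplyr)

# Toy data, same values as the example data in this answer
bn_phyla <- tibble(
  Station = rep(c("A", "B"), each = 4),
  Month = rep(c("Feb-18", "Mar-18"), each = 4),
  Phylum = rep(c("Annelida", "Arthropoda", "Mollusca", "Nemertea"), times = 2),
  SumOfTotal_Caught = c(20, 20, 30, 40, 40, 30, 30, 20)
)

# Step 1: one row per Station/Month holding the group total
group_totals <- bn_phyla %>%
  group_by(Station, Month) %>%
  summarise(total = sum(SumOfTotal_Caught)) %>%
  ungroup()

# Step 2: attach each group's total to its original rows, then divide;
# Phylum survives because the original table is the left side of the join
bn_joined <- bn_phyla %>%
  left_join(group_totals, by = c("Station", "Month")) %>%
  mutate(prop = SumOfTotal_Caught / total)

print(bn_joined)
```

The result has the same `prop` values as the grouped mutate, plus an extra `total` column, which can be handy if the totals are needed elsewhere.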
phiver