I'm fairly new to R (taking it as a grad student) and I'm unsure how to combine and sum specific data.

So right now my data looks like this

[Image: screenshot of the data, not preserved]

What I would like to do is look only at the site and each type of earthworm (Juv, Epi, Endo, Ane, Unk), and group all of the samples from a site into one row per site. For example, I want all my site 5 rows combined to show the sum of each earthworm type (Juv, Epi, Endo, etc.), and the same for site 27 and all my other sites.

So I'm hoping the end result would be something like

| Site | Juv | Epi |.........
| ---- | --- | --- |.........
| 5    | 15  | 2   |.........
| 27   | 6   | 0   |.........

I hope this all makes sense. Let me know if any more information is needed!

My professor mentioned the `tapply()` function, but we haven't covered it in class and I don't understand how to apply it to my data.

MrFlick
Heikaru
  • Please avoid posting images of data or code. Can you make your post [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and provide your data using `dput()`, as well as any code you've tried so far, even if it doesn't fully work? – jrcalabrese Feb 15 '23 at 19:01

3 Answers


Base R with aggregate:

aggregate(cbind(Juv, Epi, Endo, Ane, Unk) ~ Site, data = df, FUN = function(x) sum(x, na.rm = TRUE))

With dplyr:

library(dplyr)

df %>% 
  group_by(Site) %>% 
  summarise(across(c(Juv, Epi, Endo, Ane, Unk), ~sum(., na.rm = TRUE)))

output:

  Site Juv Epi Endo Ane Unk
1   27   6   0    0   0   0
2   34  15   2    3   0   0

data without date column:

df <- structure(list(Site = c(34L, 34L, 34L, 34L, 27L, 27L, 27L, 27L
), Sample = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L), Stake = c(1L, 
5L, 7L, 8L, 2L, 4L, 3L, 1L), Julian_date = c(22168L, 22168L, 
22168L, 22168L, 22172L, 22172L, 22172L, 22172L), Juv = c(2L, 
4L, 5L, 4L, 2L, 2L, 2L, 0L), Epi = c(0L, 2L, 0L, 0L, 0L, 0L, 
0L, 0L), Endo = c(1L, 2L, 0L, 0L, 0L, 0L, 0L, 0L), Ane = c(0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L), Unk = c(0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L)), class = "data.frame", row.names = c(NA, -8L))
TarJae

Let's say your data frame is named `mydata`. Then use the `aggregate()` function to sum the values by site:

result_sum <- aggregate(cbind(Juv, Epi, Endo, Ane, Unk) ~ Site, data = mydata, FUN = sum)
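
Since the question mentions `tapply()`, here is a minimal sketch of how the same per-site sums could be obtained with it. The small `mydata` frame below is hypothetical sample data that only borrows the column names from the question, and `worm_cols` and `result_tapply` are illustrative names, not part of the original answer.

```r
# Hypothetical sample data (assumption: same column names as in the question)
mydata <- data.frame(
  Site = c(34L, 34L, 27L, 27L),
  Juv  = c(2L, 4L, 2L, 0L),
  Epi  = c(0L, 2L, 0L, 0L)
)

# tapply() applies sum() to one column, split by Site
juv_by_site <- tapply(mydata$Juv, mydata$Site, sum)

# Repeating that for each worm-type column gives a site-by-type matrix
worm_cols <- c("Juv", "Epi")
result_tapply <- sapply(mydata[worm_cols],
                        function(col) tapply(col, mydata$Site, sum))
```

Note that `tapply()` handles one column at a time, which is why `aggregate()` is usually the more convenient choice when several columns need summing. The row names of `result_tapply` are the site numbers as character strings.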
S-SHAAF

Using `fsum()` from the collapse package:

library(collapse)
fsum(df1[5:9], g = df1$Site)
   Juv Epi Endo Ane Unk
27   6   0    0   0   0
34  15   2    3   0   0
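
If installing an extra package is not an option, base R's `rowsum()` should give a result of the same shape. This is only a sketch: the `df1` below is hypothetical sample data reusing the column names from the question, and `totals` is an illustrative name.

```r
# Hypothetical sample data (assumption: same column names as in the question)
df1 <- data.frame(
  Site = c(34L, 34L, 27L, 27L),
  Juv  = c(2L, 4L, 2L, 0L),
  Epi  = c(0L, 2L, 0L, 0L)
)

# rowsum() adds up the rows of the selected columns within each group;
# the group values (sites) become the row names of the result
totals <- rowsum(df1[c("Juv", "Epi")], group = df1$Site)
```

As with `fsum()`, the site ends up in the row names rather than in a column of its own.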
akrun