I am trying to combine rows of data based on the levels of other variables. A sample of my data is below.
data <- structure(list(FishID = c("SSS012", "SSS012", "SSS012", "SSS014",
"SSS014", "SSS014", "SSS24", "SSS24", "SSS24", "SSS24", "SSS24"
), Taxa = c("Krill", "Onisimus", "Onisimus", "Krill", "Krill",
"Onisimus", "Copepods", "Onisimus", "Themisto", "Unidentified Fish",
"Unidentified Fish"), EstimatedNumber = c(2L, 6L, 1L, 2L, NA,
6L, 16L, 4L, 389L, 80L, 1L), TotalMass = c(0.074, 0.143, 0.052,
0.034, 5.342, 0.16, 0.09, 0.087, 28.742, 6.556, 0.782), Comments = c("",
"", "", "", "", "", "", "", "", "", "will likely change taxa to fish"
), year = c(2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2019L,
2019L, 2019L, 2019L, 2019L), PA = c(1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1)), row.names = c(487L, 488L, 489L, 512L, 513L, 514L, 628L,
634L, 636L, 638L, 639L), class = "data.frame")
If we run
table(data$FishID, data$Taxa)
we can see that some taxa occur twice for the same FishID, while others occur only once. I would like each taxon to appear only once per FishID, while keeping the estimated number and total mass from both rows by summing them. For example, for FishID SSS012 I want a single row for Onisimus with an estimated number of 7 (6 + 1) and a total mass of 0.195 (0.143 + 0.052), in addition to the row for Krill.
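To make the desired result concrete, here is a minimal base R sketch of the kind of output I am after. It is only an illustration, not necessarily the best approach. The data frame `fish` is a trimmed copy of the first six sample rows so the snippet runs on its own, and treating the missing EstimatedNumber as 0 when summing (via `na.rm = TRUE`) is an assumption on my part. Columns such as Comments, year, and PA are not handled here.

```r
# Trimmed copy of the first six sample rows (only the columns being combined).
fish <- data.frame(
  FishID = c("SSS012", "SSS012", "SSS012", "SSS014", "SSS014", "SSS014"),
  Taxa = c("Krill", "Onisimus", "Onisimus", "Krill", "Krill", "Onisimus"),
  EstimatedNumber = c(2L, 6L, 1L, 2L, NA, 6L),
  TotalMass = c(0.074, 0.143, 0.052, 0.034, 5.342, 0.16)
)

# Sum both measurement columns within each FishID/Taxa combination.
# na.action = na.pass keeps rows containing NA instead of dropping them;
# na.rm = TRUE is passed on to sum(), so an NA count is treated as 0 (assumption).
combined <- aggregate(
  cbind(EstimatedNumber, TotalMass) ~ FishID + Taxa,
  data = fish,
  FUN = sum,
  na.rm = TRUE,
  na.action = na.pass
)

# Sort so that the rows for each fish sit together.
combined <- combined[order(combined$FishID, combined$Taxa), ]
print(combined)
```

With these six rows the result has four rows, one per FishID and Taxa pair, and the SSS012 Onisimus row holds 7 for EstimatedNumber and 0.195 for TotalMass.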