I have a large dataset that I would like to both reshape and aggregate, and am running into some difficulties.
For example, consider the following, where quotes separate columns in database. Factors are year
, habitat
, site
, and replicate
. The fields species
and biomass
are the data:
"year" "habitat" "site" "replicate" "species" "biomass"
2010 inner a 1 sp.1 10
2010 inner a 1 sp.3 6
2010 inner a 1 sp.1 5
2010 inner a 1 sp.2 8
2010 inner a 1 sp.4 4
2010 inner a 1 sp.5 7
2010 inner a 2 sp.3 5
2010 inner a 2 sp.2 6
2010 inner a 2 sp.5 2
2010 inner a 2 sp.1 5
2010 inner a 3 sp.4 5
2010 inner a 3 sp.3 4
2010 inner a 3 sp.6 8
2010 inner a 3 sp.2 5
2010 outer b 1 sp.1 6
2010 outer b 1 sp.3 9
2010 outer b 1 sp.3 3
2010 outer b 1 sp.2 6
2010 outer b 2 sp.5 4
2010 outer b 2 sp.1 5
2010 outer b 2 sp.1 7
2010 outer b 2 sp.2 5
2010 outer b 3 sp.4 2
2010 outer b 3 sp.6 5
2010 outer b 3 sp.2 4
2010 outer b 3 sp.1 4
What I would like to do is reshape and aggregate my table while keeping all the factors. So, a species
by site
matrix, with the ability to keep factors and aggregate at the level of replicate. I've included an example picture below of what I'd like, where the data were filled with a 'sum' function using Excel pivot table. Is there a simple code for this in R using dplyr
and groups? tapply
and aggregate
?
year habitat site replicate sp.1 sp.2 sp.3 sp.4 sp.5 sp.6
2010 inner a 1 15 8 6 4 7 0
2010 inner a 3 0 5 4 5 0 8
2010 outer b 1 6 6 12 0 0 0
2010 outer b 2 12 5 0 0 4 0
2010 outer b 3 4 4 0 2 0 5