I'm new to R and have a complicated data set, so I hope my explanation is clear. I have multiple data frames that I need to combine in a series of steps. Here's one example. I have three data frames. One is a list of species names and their corresponding codes:
> df.sp
Species Code
Picea   PI
Pinus   CA
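Another is a list of sites with species abundance data for different locations within each site (dir). Unfortunately, the order of the species differs from one location to the next, so each row of counts is preceded by a row giving the species codes for columns t01 to t04.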
> df.site
Site dir total t01 t02 t03 t04
2        Total PI  CA  AB  T
2    N   9     1   5   na  na
2              AB  ZI  PI  CA
2    S   5     2   2   1   4
3              DD  EE  AB  YT
3    N   6     1   1   5   3
3              AB  YT  EE  DD
3    S   5     4   3   1   1
Then I also have a data frame of traits corresponding to the species:
> df.trait
Species leaft rootl
Picea   0.01  1.2
Pinus   0.02  3.5
One thing I want to do is get the average value of each trait (df.trait$leaft and df.trait$rootl) across all the species present at each site (df.site$Site) and at each location within the site (df.site$dir: N, S). So the result for the first row would be:
Site dir leaft rootl
2 N 0.015 2.35
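To make the target concrete, here is a minimal base R sketch of that calculation on the toy tables above. It rests on several assumptions that are not confirmed by the real data: rows in df.site strictly alternate (a row of species codes followed by a row of counts), blank cells are empty strings, "na" is stored as text, a species counts as present when its count is non-missing and positive, and the mean is unweighted (which is what the expected 0.015 and 2.35 imply). The helper names tcols, code_rows, count_rows, long, present, with_traits and result are made up for illustration.

```r
# Toy data typed in from the printouts above (blank cells assumed to be "").
df.sp <- data.frame(Species = c("Picea", "Pinus"), Code = c("PI", "CA"),
                    stringsAsFactors = FALSE)
df.trait <- data.frame(Species = c("Picea", "Pinus"),
                       leaft = c(0.01, 0.02), rootl = c(1.2, 3.5),
                       stringsAsFactors = FALSE)
df.site <- data.frame(
  Site  = c(2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L),
  dir   = c("", "N", "", "S", "", "N", "", "S"),
  total = c("Total", "9", "", "5", "", "6", "", "5"),
  t01   = c("PI", "1",  "AB", "2", "DD", "1", "AB", "4"),
  t02   = c("CA", "5",  "ZI", "2", "EE", "1", "YT", "3"),
  t03   = c("AB", "na", "PI", "1", "AB", "5", "EE", "1"),
  t04   = c("T",  "na", "CA", "4", "YT", "3", "DD", "1"),
  stringsAsFactors = FALSE
)

tcols <- c("t01", "t02", "t03", "t04")

# Assumption: rows strictly alternate, species codes first, counts second.
code_rows  <- df.site[seq(1, nrow(df.site), by = 2), ]
count_rows <- df.site[seq(2, nrow(df.site), by = 2), ]

# Long format: one row per site, location and species column.
# unlist() stacks the columns one after another, so Site and dir are
# repeated once per column to stay aligned.
long <- data.frame(
  Site  = rep(count_rows$Site, times = length(tcols)),
  dir   = rep(count_rows$dir,  times = length(tcols)),
  Code  = unlist(code_rows[tcols], use.names = FALSE),
  count = suppressWarnings(
    as.numeric(unlist(count_rows[tcols], use.names = FALSE))  # "na" becomes NA
  ),
  stringsAsFactors = FALSE
)

# Keep only species actually recorded (non-missing, positive count).
present <- long[!is.na(long$count) & long$count > 0, ]

# Codes -> species names -> traits; codes without a match drop out.
with_traits <- merge(merge(present, df.sp, by = "Code"),
                     df.trait, by = "Species")

# Unweighted mean of each trait per site and location.
result <- aggregate(cbind(leaft, rootl) ~ Site + dir,
                    data = with_traits, FUN = mean)
print(result)
```

On the toy data this gives leaft 0.015 and rootl 2.35 for site 2, N. Site 3 drops out entirely because none of its codes appear in df.sp; keeping such sites with NA values would need all = TRUE in the merges plus different handling in the aggregation step.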
I hope that makes sense. I find it very hard to think through how to go about this. I've tried working from this post and this (and many others) but got lost. Thanks for the help, it is really appreciated.
UPDATE: Here is a sample of the actual df.site (reduced) using dput:
> dput(head(df.site))
structure(list(Site = c(2L, 2L, 2L, 2L, 2L, 2L), dir = c("rep17316",
"N", "", "S", "", "SE"), total = c("Total", "9", "",
"10", "", "9"), t01 = c("PI", "4", "CA", "1", "SILLAC",
"3"), t02 = c("CXBLAN", "3", "ZIZAUR", "4", "OENPIL", "2"),
t03 = c("ZIZAPT", "1", "ECHPUR", "2", "ASCSYR", "2")),
.Names = c("site", "dir", "total", "t01", "t02", "t03"),
row.names = 2:7, class = "data.frame")