I'm new to R (I use RStudio) and I have to analyze a large data frame (60 variables, 10,000 observations). My data frame has a species column (espece) containing many different animal species. The goal of my work is to get results for 8 different species, so I have to work on each of them separately. I started by building separate subsets (as I learned in school), with the help of some awesome packages (special thanks to dplyr and tidyr). But now I have to repeat many identical (or nearly identical) operations on each of the 8 species, so I spend a lot of time copying and pasting, and when I make a mistake I have to check and fix it across thousands of lines. So I tried to learn about loops and the apply family of functions, but I can't get anything to work.
Here is an example of what I do for one species the traditional way (reorganizing the data):
# Keep one species on the "TOTAL" plot, then put one column per passage
espece_td_a <- subset(BDD, BDD$espece == "espece A" & BDD$placette == "TOTAL") %>%
  select(code_site, passage, adulte) %>%
  spread(passage, adulte)
# Add the sites from BDD_P3_TOT_site and set the missing counts to 0
espece_td_a <- full_join(espece_td_a, BDD_P3_TOT_site)
espece_td_a <- replace(espece_td_a, is.na(espece_td_a), 0)
# Turn the counts into presence/absence (1/0)
espece_td_a$P1[espece_td_a$P1 > 0] <- 1
espece_td_a$P2[espece_td_a$P2 > 0] <- 1
espece_td_a$P3[espece_td_a$P3 > 0] <- 1
write.csv(espece_td_a, file = "espece_td_a.csv")
BDD is my data frame.
BDD_P3_TOT_site is a vector (or a data frame with 1 column and many rows?) built from BDD.
This "traditional way" works for me, but I have to do something like this so many times, and it takes a lot of time.
So I tried to turn these steps into a function:
f <- function(x)
{
select(code_site, passage, adulte)%>%
spread(x, x$passage, x$adulte)%>%
full_join(x, BDD_P3_TOT_site) -> x
x <- replace(x, is.na(x),0)
x$P1[x$P1>0]<-1
x$P2[x$P2>0]<-1
x$P3[x$P3>0]<-1
}
I would like to apply this function to my dataset with lapply (with my 8 species in a list):
l <- c("espece_a","espece_b","espece_c")
lapply(l,f(x))
Problems:
I know this is the wrong way to call lapply if I want to pull my species out of BDD.
The function itself doesn't work. I had already made 8 subsets (one for each species of interest) in my global environment: espece_a, espece_b, ...
So I tried passing my subsets into the function one at a time:
> f(espece_a)
Error in select_(.data, .dots = lazyeval::lazy_dots(...)) :
  object 'code_site' not found
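A likely reading of this error: inside f, select() is called without a data frame, so code_site is looked up as an ordinary object instead of as a column of x. The small check below uses a made-up two-row data frame (df is hypothetical, not part of BDD) to reproduce that behaviour. It is only a sketch of the suspected cause, not a confirmed diagnosis.

```r
library(dplyr)

# Hypothetical toy data frame with the same column names as in the question
df <- data.frame(code_site = c("S1", "S2"), adulte = c(1, 0))

# No data frame given: select() tries to evaluate code_site as an object and fails
res <- try(select(code_site, adulte), silent = TRUE)
failed_without_data <- inherits(res, "try-error")

# Data frame given as the first argument: the columns are found
ok <- select(df, code_site, adulte)
```

If that is the cause, the first step of the function would need to start from x (for example x %>% select(...)).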
I would like the resulting table to appear in my global environment under a name that lets me recognize it (e.g. "espece_td_a").
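To make the goal concrete, here is a minimal sketch of the kind of thing I am after: one function that takes a species name, and lapply over a named vector so that each result keeps a recognizable name. Everything in it is an assumption for illustration: the toy BDD, the contents of BDD_P3_TOT_site (taken here to be a one-column data frame of site codes), the function name make_presence_table, and the mapping from "espece A" to "espece_td_a".

```r
library(dplyr)
library(tidyr)

# Toy stand-ins for the real objects (hypothetical values, same column names)
BDD <- expand.grid(
  espece = c("espece A", "espece B"),
  code_site = c("S1", "S2"),
  passage = c("P1", "P2", "P3"),
  stringsAsFactors = FALSE
)
BDD$placette <- "TOTAL"
BDD$adulte <- c(0, 2, 3, 0, 1, 0, 0, 4, 0, 0, 5, 1)
# Assumed shape: one column listing every site, including one with no records
BDD_P3_TOT_site <- data.frame(code_site = c("S1", "S2", "S3"),
                              stringsAsFactors = FALSE)

# Same steps as the "traditional way", with the species name as a parameter
make_presence_table <- function(species_name, data = BDD, sites = BDD_P3_TOT_site) {
  out <- data %>%
    filter(espece == species_name, placette == "TOTAL") %>%
    select(code_site, passage, adulte) %>%
    spread(passage, adulte) %>%
    full_join(sites, by = "code_site")
  out[is.na(out)] <- 0                       # missing counts become 0
  for (p in intersect(c("P1", "P2", "P3"), names(out))) {
    out[[p]][out[[p]] > 0] <- 1              # counts become presence/absence
  }
  out
}

# The names of the vector become the names of the result tables
species <- c(espece_td_a = "espece A", espece_td_b = "espece B")
results <- lapply(species, make_presence_table)

# One CSV per species, named after the table
for (nm in names(results)) {
  write.csv(results[[nm]], file = paste0(nm, ".csv"))
}

# Optional: copy each table into the global environment under its name
list2env(results, envir = .GlobalEnv)
```

Note that lapply receives the function itself (make_presence_table), not a call such as f(x). Keeping the tables in the results list may be enough on its own; list2env is only there to get objects like espece_td_a into the global environment.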