
I'm new to R (I use RStudio) and I have to analyze a large data frame (60 variables, 10,000 observations). It has a column named espece (species) that contains many different animal species. The goal of my work is to produce results for 8 of these species, so I have to work on each of them separately. I started by building a subset for each one (as I learned in school) with the help of some great packages (special thanks to dplyr and tidyr). But now I have to repeat many identical (or nearly identical) operations on each of the 8 species, so I spend a lot of time copying and pasting, and when I make a mistake I have to check and fix it across thousands of lines. So I tried to learn about loops and the apply family of functions, but I can't get anything to work.


Here is an example of what I do for one species the traditional way (organizing the data):

espece_td_a <- subset(BDD, espece == "espece A" & placette == "TOTAL") %>%
  select(code_site, passage, adulte) %>%
  spread(passage, adulte)                                   # one column per passage (P1, P2, P3)
espece_td_a <- full_join(espece_td_a, BDD_P3_TOT_site)      # join with BDD_P3_TOT_site
espece_td_a <- replace(espece_td_a, is.na(espece_td_a), 0)  # NA -> 0
espece_td_a$P1[espece_td_a$P1 > 0] <- 1                     # any positive count -> 1
espece_td_a$P2[espece_td_a$P2 > 0] <- 1
espece_td_a$P3[espece_td_a$P3 > 0] <- 1
write.csv(espece_td_a, file = "espece_td_a.csv")

BDD is my data frame.

BDD_P3_TOT_site is a vector (or a data frame with one column and many rows?) built from BDD.

This "traditional way" works for me, but I have to do something like this so many times, and it takes a lot of time.

Then I tried to turn this into a function:

f <- function(x)
{
  select(code_site, passage, adulte)%>%
    spread(x, x$passage, x$adulte)%>%
    full_join(x, BDD_P3_TOT_site) -> x
  x <- replace(x, is.na(x),0)
  x$P1[x$P1>0]<-1
  x$P2[x$P2>0]<-1
  x$P3[x$P3>0]<-1
}

I would like to apply this function to my data with lapply (with my 8 species in a list):

l <- c("espece_a","espece_b","espece_c")

lapply(l,f(x))

Problems:

  • I know this is the wrong way to call lapply if I want to pull my species out of BDD.

  • The function doesn't work. I already made 8 subsets (one for each species of interest) in my global environment: espece_a, espece_b, ...

Then I tried passing my subsets to the function one by one:

> f(espece_a)

Error in select_(.data, .dots = lazyeval::lazy_dots(...)) :
  object 'code_site' not found


I would like the resulting table to appear in my global environment under a name that lets me recognize it (e.g. "espece_td_a").

Sotos
  • Geia sou Theodore :). Welcome to SO. Please go through [this link](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and create a sample data set for us to play with. Also, add your expected output. – Sotos Jul 04 '17 at 16:00
  • OK, after a first read of your post, it seems like you need to study `dplyr` syntax. Quite a few things are wrong there. Also, in order to write functions using `dplyr`, you need to read up on standard / non-standard evaluation. Also, I think you are really overcomplicating things. If you provide a sample, we should be able to help you – Sotos Jul 04 '17 at 16:09

2 Answers


You have 3 issues relating to your use of lapply:

  1. You need to return the object x at the end of the function f, otherwise the caller does not get the modified data frame back.
  2. l should be a list of data frames, not a character vector of data frame names, i.e. l <- list(espece_a, espece_b, espece_c)
  3. When using lapply with an existing function, you only need to pass the name of the function, i.e. lapply(l, f)

Hopefully this should solve your problem.
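
To make the three points concrete, here is a minimal sketch in base R. The two small data frames and the simplified f are made-up stand-ins (not the asker's real data or full function); they only illustrate returning x, building a list of data frames, and passing f itself to lapply.

```r
# Toy data frames standing in for the per-species subsets (hypothetical values).
espece_a <- data.frame(code_site = c("s1", "s2"), P1 = c(0, 3), P2 = c(2, 0))
espece_b <- data.frame(code_site = c("s1", "s2"), P1 = c(5, 0), P2 = c(0, 0))

# Simplified stand-in for f: turn counts into presence/absence (0/1).
f <- function(x) {
  x$P1[x$P1 > 0] <- 1
  x$P2[x$P2 > 0] <- 1
  x  # point 1: the last expression is what the function returns
}

# Point 2: a (named) list of data frames, not a character vector of names.
l <- list(espece_a = espece_a, espece_b = espece_b)

# Point 3: pass the function itself, not a call such as f(x).
res <- lapply(l, f)

# Each element is the transformed data frame, accessible by name.
res$espece_a
```

Naming the list elements is optional, but it means the results come back under the same names, which makes it easier to tell the species apart afterwards.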

LucyMLi
  • I solved the function problem: if I apply the function to one of my subsets, it transforms it. But when I use lapply, I just get a list with the names of my subsets; I want every subset in the list to be transformed by my function. – Théodorus Kyritsos Jul 05 '17 at 12:09

I solved the function problem:

f <- function(X) {
  # keep the needed columns, then one column per passage
  X <- select(X, code_site, passage, adulte) %>%
    spread(passage, adulte)
  X <- full_join(X, BDD_P3_TOT_site)
  X <- replace(X, is.na(X), 0)   # NA -> 0
  X$P1[X$P1 > 0] <- 1            # any positive count -> 1
  X$P2[X$P2 > 0] <- 1
  X$P3[X$P3 > 0] <- 1
  return(X)
}

test <- f(espece_a)
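
To run the same function on every species without copy/paste, one possible approach is to split BDD by species and lapply over the pieces. The sketch below uses made-up toy data; it assumes that BDD_P3_TOT_site is a one-column data frame of site codes and that the join key is code_site, which may not match the real data. The output directory (tempdir()) and the file-naming scheme are also just illustrative choices.

```r
library(dplyr)
library(tidyr)

# Toy stand-ins (hypothetical values) for BDD and BDD_P3_TOT_site.
BDD <- data.frame(
  espece    = rep(c("espece A", "espece B"), each = 6),
  placette  = "TOTAL",
  code_site = rep(rep(c("s1", "s2"), each = 3), times = 2),
  passage   = rep(c("P1", "P2", "P3"), times = 4),
  adulte    = c(0, 2, 0, 3, 0, 1, 1, 0, 0, 0, 0, 5),
  stringsAsFactors = FALSE
)
BDD_P3_TOT_site <- data.frame(code_site = c("s1", "s2", "s3"),
                              stringsAsFactors = FALSE)

f <- function(X) {
  X <- select(X, code_site, passage, adulte) %>%
    spread(passage, adulte)
  X <- full_join(X, BDD_P3_TOT_site, by = "code_site")  # assumed join key
  X <- replace(X, is.na(X), 0)
  X$P1[X$P1 > 0] <- 1
  X$P2[X$P2 > 0] <- 1
  X$P3[X$P3 > 0] <- 1
  X
}

# One data frame per species, named after the values of the espece column.
total   <- filter(BDD, placette == "TOTAL")
subsets <- split(total, total$espece)

# Apply f to every subset; the list names are carried over to the result.
results <- lapply(subsets, f)
names(results) <- gsub(" ", "_", names(results))  # e.g. "espece_A"

# Write one CSV per species, named after the list element.
out_dir <- tempdir()  # illustrative; replace with a real folder
for (nm in names(results)) {
  write.csv(results[[nm]],
            file = file.path(out_dir, paste0(nm, "_td.csv")),
            row.names = FALSE)
}

# Optional: copy the list elements into the global environment.
# list2env(results, envir = globalenv())
```

Keeping the tables together in a named list (results$espece_A, and so on) is usually easier to work with than eight separate objects, but the commented list2env line shows one way to get them into the global environment if that is what you need.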