0

I am new to R and am having trouble with my code. I have 17 data frames (one per year, 1996 to 2012) and I would like to apply the same function to each of them, then combine all the results into a new data frame. I wrote this code and it works well:

    df2012<-as.data.frame(cprop(wtd.table(database2012$year,database2012$nivvie_dec,weights=database2012$wprm),total=FALSE))
    df2012$annee<-"2012"
    df2011<-as.data.frame(cprop(wtd.table(database2011$year,database2011$nivvie_dec,weights=database2011$wprm),total=FALSE))
    df2011$annee<-"2011"
    df2010<-as.data.frame(cprop(wtd.table(database2010$year,database2010$nivvie_dec,weights=database2010$wprm),total=FALSE))
    df2010$annee<-"2010"
    df2009<-as.data.frame(cprop(wtd.table(database2009$year,database2009$nivvie_dec,weights=database2009$wprm),total=FALSE))
    df2009$annee<-"2009"
    df2008<-as.data.frame(cprop(wtd.table(database2008$year,database2008$nivvie_dec,weights=database2008$wprm),total=FALSE))
    df2008$annee<-"2008"
    df2007<-as.data.frame(cprop(wtd.table(database2007$year,database2007$nivvie_dec,weights=database2007$wprm),total=FALSE))
    df2007$annee<-"2007"
    df2006<-as.data.frame(cprop(wtd.table(database2006$year,database2006$nivvie_dec,weights=database2006$wprm),total=FALSE))
    df2006$annee<-"2006"
    df2005<-as.data.frame(cprop(wtd.table(database2005$year,database2005$nivvie_dec,weights=database2005$wprm),total=FALSE))
    df2005$annee<-"2005"
    df2004<-as.data.frame(cprop(wtd.table(database2004$year,database2004$nivvie_dec,weights=database2004$wprm),total=FALSE))
    df2004$annee<-"2004"
    df2003<-as.data.frame(cprop(wtd.table(database2003$year,database2003$nivvie_dec,weights=database2003$wprm),total=FALSE))
    df2003$annee<-"2003"
    df2002<-as.data.frame(cprop(wtd.table(database2002$year,database2002$nivvie_dec,weights=database2002$wprm),total=FALSE))
    df2002$annee<-"2002"
    df2001<-as.data.frame(cprop(wtd.table(database2001$year,database2001$nivvie_dec,weights=database2001$wprm),total=FALSE))
    df2001$annee<-"2001"
    df2000<-as.data.frame(cprop(wtd.table(database2000$year,database2000$nivvie_dec,weights=database2000$wprm),total=FALSE))
    df2000$annee<-"2000"
    df1999<-as.data.frame(cprop(wtd.table(database1999$year,database1999$nivvie_dec,weights=database1999$wprm),total=FALSE))
    df1999$annee<-"1999"
    df1998<-as.data.frame(cprop(wtd.table(database1998$year,database1998$nivvie_dec,weights=database1998$wprm),total=FALSE))
    df1998$annee<-"1998"
    df1997<-as.data.frame(cprop(wtd.table(database1997$year,database1997$nivvie_dec,weights=database1997$wprm),total=FALSE))
    df1997$annee<-"1997"
    df1996<-as.data.frame(cprop(wtd.table(database1996$year,database1996$nivvie_dec,weights=database1996$wprm),total=FALSE))
    df1996$annee<-"1997"
    df19962012<-rbind(df1996,df1997,df1998,df1999,df2000,df2001,df2002,df2003,df2004,df2005,df2006,df2007,df2008,df2009,df2010,df2011,df2012)

However, the code is long, and I need to repeat it for other variables, such as sex, educational level and family structure, in place of year. I looked for a shorter version using lapply, but all my attempts failed. Does anyone know a way to shorten the code?

Thank you very much for your help!

David Marguerit
  • This is not a [reproducible example](http://stackoverflow.com/q/5963269/2572423), so it will be difficult for others to help you. Perhaps you can create some fake data sets / data.frames? You will likely want to write a simple function. – JasonAizkalns Oct 29 '15 at 14:59
  • I'm sure you want `df1996$annee<-"1996"` and not `df1996$annee<-"1997"`. Where do the 17 databases come from (how are they loaded into R)? – jogo Oct 29 '15 at 15:31

3 Answers

2

Again, see my comment about creating a reproducible example, but the following should get at the core of your question and is reproducible. Walk through each step slowly to understand what's going on. In general, you should strive for DRY (Don't Repeat Yourself) code and get in the habit of writing small, simple functions whenever you find yourself repeating lines of code:

Make two "fake" data.frames:

    df1 <- data.frame(x = 1:10)
    df2 <- data.frame(x = 11:20)

A simple "dummy" function `h(df)` takes a data.frame and creates a new column `y` by adding 10 to the existing `x` column:

    h <- function(df) {
      df$y <- df$x + 10
      df
    }

Find all objects whose names contain `df` followed by a digit, and store their names (a character vector) in `dfs`:

    dfs <- ls(pattern = "df[0-9]")
    dfs

Fetch the objects by name with `mget()`, apply the function `h` to each of them with `lapply()`, and finally `rbind` the results via `do.call()`:

    do.call(rbind, lapply(mget(dfs), h))

    #         x  y
    # df1.1   1 11
    # df1.2   2 12
    # df1.3   3 13
    # df1.4   4 14
    # df1.5   5 15
    # df1.6   6 16
    # df1.7   7 17
    # df1.8   8 18
    # df1.9   9 19
    # df1.10 10 20
    # df2.1  11 21
    # df2.2  12 22
    # df2.3  13 23
    # df2.4  14 24
    # df2.5  15 25
    # df2.6  16 26
    # df2.7  17 27
    # df2.8  18 28
    # df2.9  19 29
    # df2.10 20 30
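
A variation on the steps above, as a rough sketch rather than the one right way: if the data frames are kept in a named list from the start, `ls()` and `mget()` are not needed, and the list names can be used to record where each row came from. The `source` column and the toy data are assumptions made for illustration only.

```r
# Toy data mirroring df1 and df2 above, stored in a named list.
dfs <- list(
  df1 = data.frame(x = 1:10),
  df2 = data.frame(x = 11:20)
)

# Same dummy transformation as h() above.
h <- function(df) {
  df$y <- df$x + 10
  df
}

# Apply h() to every element of the list.
results <- lapply(dfs, h)

# Tag each result with the name of the data frame it came from.
results <- Map(function(df, name) {
  df$source <- name
  df
}, results, names(results))

# Stack everything into one data frame and drop the generated row names.
combined <- do.call(rbind, results)
rownames(combined) <- NULL

str(combined)
```

With the origin stored in a column, the row names no longer have to carry that information.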

Some posts that will be helpful to guide your understanding:

JasonAizkalns
1

To get a list of data frames:

    yDF <- function(y) {
      # look up the object named e.g. "database1996"
      db <- get(paste0("database", y))
      df <- as.data.frame(cprop(wtd.table(db$year,db$nivvie_dec,weights=db$wprm),total=FALSE))
      df$annee <- y   # note: numeric here, a character string in the question
      df
    }
    years <- 1996:2012
    L <- lapply(years, yDF)

Normally I am not a fan of `get()`. You can also `rbind()` the results into one long data frame:

    DF <- yDF(1996)
    for (y in 1997:2012) DF <- rbind(DF, yDF(y))
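
If the yearly data sets can be kept in a named list rather than as separate `databaseYYYY` objects, `get()` is not needed, and the tabulated variable can become a function argument so that the same code serves for sex, education and so on. The following is only a minimal sketch under assumptions: the data is fake, `make_fake_db`, `weighted_cprop` and `tabulate_all` are hypothetical helper names, and base R's `xtabs()` and `prop.table()` stand in for `wtd.table()` and `cprop()`, so the output layout may differ from what those functions return.

```r
# Fake stand-in for the yearly survey files: a named list with one
# data frame per year, using the column names from the question.
set.seed(1)
make_fake_db <- function(year, n = 50) {
  data.frame(
    year       = year,
    sex        = sample(c("F", "M"), n, replace = TRUE),
    nivvie_dec = sample(1:3, n, replace = TRUE),
    wprm       = runif(n, 0.5, 2)
  )
}
years <- 1996:1998
databases <- setNames(lapply(years, make_fake_db), years)

# Weighted column percentages of `var` by nivvie_dec, in long format.
# Base-R approximation of as.data.frame(cprop(wtd.table(...), total = FALSE)).
weighted_cprop <- function(db, var) {
  tab <- xtabs(wprm ~ value + nivvie_dec,
               data = data.frame(value      = db[[var]],
                                 nivvie_dec = db$nivvie_dec,
                                 wprm       = db$wprm))
  pct <- prop.table(tab, margin = 2) * 100   # each column sums to 100
  out <- as.data.frame(pct, responseName = "pct")
  names(out)[1] <- var
  out
}

# Run the tabulation for every year and stack the results in one step,
# instead of growing a data frame inside a loop.
tabulate_all <- function(databases, var) {
  pieces <- Map(function(db, year) {
    out <- weighted_cprop(db, var)
    out$annee <- year   # list names are character, like "2012" in the question
    out
  }, databases, names(databases))
  result <- do.call(rbind, pieces)
  rownames(result) <- NULL
  result
}

by_sex <- tabulate_all(databases, "sex")
head(by_sex)
```

Switching to another variable is then a one-argument change, e.g. `tabulate_all(databases, "year")`.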
jogo
0

You can do something like complete_dataframe <- rbind(...) to combine all your data frames together, especially if they have a separate column that defines each dataframe (here it will be annee). Then you can use either the data.table package or dplyr package to apply a function over specific groups.

In dplyr, the workflow would be

    complete_dataframe %>% group_by(annee) %>% mutate(new_var = somefunction(columns_to_pass_into_function))

to generate new variables, or

    complete_dataframe %>% group_by(annee) %>% summarise(new_var = somefunction(columns_to_pass_into_function))

to create a summary table over the groups.
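
As a concrete illustration of the summarise workflow, here is a small sketch that computes the weighted share of each `nivvie_dec` level within each year. It assumes dplyr 1.0 or later (for the `.groups` argument) and uses made-up data; the `weight` and `pct` column names are arbitrary choices, and the result is not meant to reproduce the output of `cprop()` exactly.

```r
library(dplyr)

# Fake stacked data: one row per respondent, with the year in `annee`.
set.seed(42)
complete_dataframe <- data.frame(
  annee      = rep(c("2011", "2012"), each = 100),
  nivvie_dec = sample(1:10, 200, replace = TRUE),
  wprm       = runif(200, 0.5, 2)
)

# Sum the weights per year and decile, then convert the sums to
# percentages within each year.
shares <- complete_dataframe %>%
  group_by(annee, nivvie_dec) %>%
  summarise(weight = sum(wprm), .groups = "drop_last") %>%  # still grouped by annee
  mutate(pct = 100 * weight / sum(weight)) %>%
  ungroup()

head(shares)
```

Because the result stays grouped by `annee` after `summarise()`, the `mutate()` step divides by the yearly total rather than the grand total.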

Allen Wang