
This is the code I am currently using to move data from multiple data frames into a time-ordered vector which I then perform analysis on and graph:

TotalLoans <- c(
  sum(as.numeric(HCD2001$loans_all)), sum(as.numeric(HCD2002$loans_all)),
  sum(as.numeric(HCD2003$loans_all)), sum(as.numeric(HCD2004$loans_all)),
  sum(as.numeric(HCD2005$loans_all)), sum(as.numeric(HCD2006$loans_all)),
  sum(as.numeric(HCD2007$loans_all)), sum(as.numeric(HCD2008$loans_all)),
  sum(as.numeric(HCD2009$loans_all)), sum(as.numeric(HCD2010$loans_all)),
  sum(as.numeric(HCD2011$loans_all)), sum(as.numeric(HCD2012$loans_all)),
  sum(as.numeric(HCD2013$loans_all)), sum(as.numeric(HCD2014$loans_all)),
  sum(as.numeric(HCD2015$loans_all)), sum(as.numeric(HCD2016$loans_all))
)

I do this four more times with other sets of data frames that are named and formatted the same way:

Varname$year

Is there a way to loop through these 16 data frames, select an individual column, perform a function on it, and put it into a vector? This is what I have tried so far:

AllList <- list(HCD2001, HCD2002, HCD2003, HCD2004, HCD2005, HCD2006, HCD2007, HCD2008, HCD2009, HCD2010, HCD2011, HCD2012, HCD2013, HCD2014, HCD2015, HCD2016)    

TotalLoans <- lapply(AllList,
                function(df){
                  sum(as.numeric(df$loans_all))
                  return(df)
                }
              )

However, it returns a Large List with every column from the data frames. All the other posts related to this were for modifying data frames, not creating a new vector with modified values of the data frames.

FRY-9C
  • Try removing `return(df)` to return the sum of *loans_all* in all dfs. Is that what you want? – Parfait Jun 14 '17 at 15:54
  • Also you may want to use `sapply` instead of `lapply` – amatsuo_net Jun 14 '17 at 15:57
  • More broadly, you might find it much simpler to move all that important data out of the variable name, where it's largely useless: `bind_rows(setNames(AllList,paste0("HCD",2001:2016)),.id = "year")` and then do operations grouped by year (this is using **dplyr**, obviously). – joran Jun 14 '17 at 15:57
  • Removing return(df) and using sapply worked perfectly. Thank you both so much. Now to figure out why. – FRY-9C Jun 14 '17 at 16:54
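
As for the "why": an R function returns the value passed to `return()`, or otherwise its last evaluated expression. In the attempt above, the sum is computed and then discarded, and `return(df)` hands back the whole data frame, so `lapply` produces a list of data frames. Ending the function on the sum gives one number per data frame, and `sapply` simplifies the resulting list to a vector. Here is a minimal sketch of that, assuming three small made-up data frames in place of the real `HCD2001` to `HCD2016`; the helper `sum_column` is a hypothetical name, not something from the question.

```r
# Made-up stand-ins for the real data frames (three years only, invented values).
HCD2001 <- data.frame(loans_all = c("10", "20"), stringsAsFactors = FALSE)
HCD2002 <- data.frame(loans_all = c("5", "15", "25"), stringsAsFactors = FALSE)
HCD2003 <- data.frame(loans_all = c("100"), stringsAsFactors = FALSE)

years <- 2001:2003
AllList <- setNames(list(HCD2001, HCD2002, HCD2003), paste0("HCD", years))

# lapply() always returns a list. With the sum as the last expression
# (and no return(df)), each element is a single number.
as_list <- lapply(AllList, function(df) sum(as.numeric(df$loans_all)))

# sapply() runs the same function but simplifies the result to a named vector.
TotalLoans <- sapply(AllList, function(df) sum(as.numeric(df$loans_all)))

# vapply() is a stricter variant: it errors unless every result is one numeric.
TotalLoansChecked <- vapply(
  AllList,
  function(df) sum(as.numeric(df$loans_all)),
  numeric(1)
)

# Hypothetical helper for repeating this with other columns or other lists.
sum_column <- function(dfs, col) {
  vapply(dfs, function(df) sum(as.numeric(df[[col]])), numeric(1))
}
TotalLoansHelper <- sum_column(AllList, "loans_all")

print(TotalLoans)
```

Because the list is named, the resulting vector carries the data frame names, which can help when labelling a plot. The `sum_column` helper is one possible way to avoid copying the call for the four other sets of data frames.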
