Suppose I have some data frames A_January, A_February, A_December, etc., each with the same 10 columns.

I need to do some data manipulation on one of the 10 columns and produce a set of new columns in each data frame. I could do this manually, but I have 400 such data frames.

More generally, I need to run the same set of operations on multiple data frames (create new variables, sort them, and so on), for example: A_January$New_var <- A_January$Var1 + A_January$Var2

How can I put this in a loop so that it runs for every data frame?

A5C1D2H2I1M1N2O1R2T1

1 Answer

The first step is very important: do not create a separate variable for each data.frame. Instead, put them all into a list of data.frames:

data <- list(A_January, A_February, A_December)

This might look cumbersome to type, especially if you have hundreds of data.frames. So if you can tell us how you came to create these data.frames, we might be able to help fix the problem at the root.
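As an illustration of fixing it at the root, here is a minimal sketch that assumes each data.frame comes from its own CSV file (the question does not say how they were created, so this is only a guess). The directory, the file names and the toy contents are made up so that the example is self-contained:

```r
# Assumption: one CSV file per data.frame. Three small files are written
# to a temporary directory so that the example can run on its own.
dir <- file.path(tempdir(), "monthly_csv")
dir.create(dir, showWarnings = FALSE)
for (m in c("January", "February", "December")) {
  write.csv(data.frame(Var1 = 1:3, Var2 = 4:6),
            file.path(dir, paste0("A_", m, ".csv")),
            row.names = FALSE)
}

# Read every matching file straight into a list of data.frames,
# without ever creating A_January, A_February, ... as separate variables.
files <- list.files(dir, pattern = "^A_.*\\.csv$", full.names = TRUE)
data <- lapply(files, read.csv)

# Name the list elements after the files so each month stays identifiable.
names(data) <- tools::file_path_sans_ext(basename(files))
```

With this approach the list is built in one step, however many files there are.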

Once you have a list, it is very easy to modify all of them:

data <- lapply(data, transform, New_var = Var1 + Var2)
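To see what this does, here is a small self-contained sketch. The two toy data.frames and their values are invented for illustration; only the `lapply`/`transform` call comes from the lines above:

```r
# Toy stand-ins for the real data.frames (hypothetical values).
A_January  <- data.frame(Var1 = 1:3, Var2 = c(10, 20, 30))
A_February <- data.frame(Var1 = 4:6, Var2 = c(40, 50, 60))

# A named list keeps track of which element belongs to which month.
data <- list(A_January = A_January, A_February = A_February)

# transform() is called once per list element; Var1 and Var2 are looked up
# inside the data.frame that is currently being processed.
data <- lapply(data, transform, New_var = Var1 + Var2)

data$A_January$New_var
# 11 22 33
```

Note that the original variables `A_January` and `A_February` are left unchanged; the modified copies live in `data`.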
flodel
  • Or if the object names have a common pattern using `mget`. – Simon O'Hanlon Aug 28 '13 at 10:27
  • `mget` will help fix the problem mid-way, not at the root... Really, you shouldn't have hundreds of similar variables in your environment. – flodel Aug 28 '13 at 10:28
  • Cool flodel, thanks for this! I believe I can use lists to do what I need. Quick question: can I do multiple transformations in one command, or do I need to do it in multiple lines? Suppose I want to sort all datasets by var1 and var2, and then I want to do something else as well. –  Aug 28 '13 at 10:31
  • You can pass any function to `lapply`, in particular, one written by yourself. Typically, you would 1) write a function that takes a data.frame as input and transforms it as you wish, 2) test it on the first data frame in the list, and 3) if you are happy with the result, apply it to all your data through `lapply`. – flodel Aug 28 '13 at 10:46
  • Hi flodel, if I need to sort my data first, will this work? `Data_sorted <- lapply(df, transform,with(df, df[order(var1,var2,var3),]) )` Let me know if this is a noob question :D –  Aug 28 '13 at 12:46
  • I don't think so. Why are you mixing `transform`, `with` and `lapply`? Try `Data_sorted <- lapply(Data, function(df) df[order(df$var1, df$var2, df$var3),])`. – Roland Aug 28 '13 at 13:04
  • Yeah, I am a bit mixed up today. Thanks for the awesome guidance and direction, Roland, again :). Things are starting to become much clearer than when I started. Thanks a lot to everyone! Excellent guidance and feedback on my approach. –  Aug 28 '13 at 13:09
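
The suggestions in the comments (collecting existing objects with `mget`, writing one function, testing it on the first element, then applying it with `lapply`) can be combined roughly as follows. This is only a sketch: the function name `process`, the toy data and the `^A_` name pattern are assumptions made for illustration.

```r
# Hypothetical example data; in practice these objects would already exist.
A_January  <- data.frame(var1 = c(2, 1, 2), var2 = c(1, 5, 0), var3 = 1:3)
A_February <- data.frame(var1 = c(3, 3, 1), var2 = c(9, 2, 4), var3 = 1:3)

# Collect every object whose name starts with "A_" into a named list.
# This assumes that all such objects are data.frames.
Data <- mget(ls(pattern = "^A_", envir = environment()),
             envir = environment())

# 1) One function that performs all the steps on a single data.frame.
process <- function(df) {
  df$New_var <- df$var1 + df$var2            # create the new variable
  df[order(df$var1, df$var2, df$var3), ]     # then sort the rows
}

# 2) Test it on the first data.frame in the list.
process(Data[[1]])

# 3) If the result looks right, apply it to the whole list.
Data_processed <- lapply(Data, process)
```

Putting every step into one function answers the "multiple transformations in one command" question: `lapply` is still called only once.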