Given a list, l, of data frame objects:

l <- list()
l$`__a` <- data.frame(`__ID` = stringi::stri_rand_strings(10, 1), col = stringi::stri_rand_strings(10, 1), check.names = FALSE)
l$`__b` <- data.frame(`__ID` = stringi::stri_rand_strings(10, 1), col = stringi::stri_rand_strings(10, 1), check.names = FALSE)
# single leading underscore: this item should not end up in l2
l$`_c` <- data.frame(`__ID` = stringi::stri_rand_strings(10, 1), col = stringi::stri_rand_strings(10, 1), check.names = FALSE)

How can I automate the following such that it works for any input list l of a similar structure?

l2 = list()
l2$`__a`$`__ID.new` <-  paste0("ABC_",l$`__a`$`__ID`)
l2$`__a`$`col.new` <-  paste0("DEF_",l$`__a`$`col`)

l2$`__b`$`__ID.new` <-  paste0("ABC_",l$`__b`$`__ID`)
l2$`__b`$`col.new` <-  paste0("DEF_",l$`__b`$`col`)

I'm looking for a solution that:

  • includes all list items starting with __ in l2
  • allows for manipulating each data frame column in a different way and adding a .new suffix to the column name
  • adds all extra columns where no manipulation is required "as is"

I've attempted using lapply and writing loops that use eval(parse()) (see "Evaluate expression given as a string").

Asked by Bobby
  • Write a function that works on one data frame (column by column), then `lapply` it to your list of data frames. The fact that you have several data frames in a list is beside the point - if you can write a function that converts one data frame it is trivial to apply it to a list of data frames. – Gregor Thomas Oct 07 '16 at 16:36
  • Thanks, and how would I select the proper list items that start with two underscores? – Bobby Oct 07 '16 at 16:39
  • Something like `l[grep(pattern = "^__", names(l))]`. So `l2 = lapply(l[grep(pattern = "^__", names(l))], your_function)`. Selecting items and applying a function to those items can be independent of what that function is. – Gregor Thomas Oct 07 '16 at 16:48
  • Why are you using such terrible names? – Rich Scriven Oct 07 '16 at 17:18
  • @RichScriven great question. They come from a relational database and I want to manipulate the data in R and then write back to the database. – Bobby Oct 07 '16 at 17:22
  • @Gregor, Rich, I've been making some good progress on this thanks to your hints. Can I post my solution attempt below and ask for feedback? Or is that likely to get a lot of negative points for this particular question? – Bobby Oct 07 '16 at 20:36
  • If you have a working answer, please post it. If you have something that doesn't work, don't post it as an answer - isolate the part that doesn't work and maybe ask a new question about it. And **please** don't confound your multiple problems. Finding list elements that begin with `__` is one question (hopefully settled by my comment above). Applying a function to a list of data frames is one issue - hopefully not a question as the answer is `lapply` as in my comment above. Finally, pasting `"ABC"` and `"DEF"` on the front of a couple columns could be one question. – Gregor Thomas Oct 07 '16 at 20:58
  • But you shouldn't be asking "*how do I do all three of these things?*". You should work out the pasting first, then go from there. One question at a time. – Gregor Thomas Oct 07 '16 at 20:58
  • I think I understand it well enough now to split it up into multiple questions. There's one point that you might know because I think it's similar to what you posted above. How can I subset a data frame such that I always get a data frame back? For example, this returns a factor, not a data frame: ``l$`__a`[, grep(pattern = "^__", names(l$`__a`))]``. ``l$`__a`[, c(1, 2)]``, however, returns a data frame since it has two columns. – Bobby Oct 07 '16 at 21:23
  • See the help page `?"["`. You can (a) omit the comma: ``l$`__a`[grep(...)]`` will work because it treats your data frame as a list, and `list[]` always returns a list. Or (b) you can add the `drop = FALSE` argument: ``l$`__a`[, grep(...), drop = FALSE]``. I would also recommend using `[`, `[[` and strings rather than `$`; it will clean up your code by letting you omit all the backticks, e.g., `l[["__a"]]` instead of ``l$`__a` ``. – Gregor Thomas Oct 08 '16 at 17:32
  • @Gregor `l[["__a"]]` is great, and I also think quotes are better than backticks. But in RStudio, there's no autocomplete like there would be with `$`. I suppose though that this is a worthwhile tradeoff for maintainable code. I can still use `$` for quick exploration. – Bobby Oct 08 '16 at 21:37
  • @Gregor FYI, I've also posted a new question about `[` and autocomplete: http://stackoverflow.com/questions/40126143/code-autocompletion-with-lists-in-rstudio – Bobby Oct 19 '16 at 08:36
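
The hints in the comments (select the `__` items, write a function for one data frame, then `lapply` it) can be combined into one minimal sketch. This is not a solution posted in the thread: the names `rules` and `transform_one`, the `extra` column, and the fixed demo values are assumptions made for illustration. Columns with a rule are transformed and renamed with a `.new` suffix, and all other columns are carried over as is.

```r
# Hypothetical per-column rules: names are source columns, values are
# functions that build the new column. Columns without a rule pass through.
rules <- list(
  `__ID` = function(x) paste0("ABC_", x),
  col    = function(x) paste0("DEF_", x)
)

# Transform ONE data frame: apply each matching rule, add the suffix to the
# resulting column name, and keep the remaining columns unchanged.
transform_one <- function(df, rules, suffix = ".new") {
  hit <- intersect(names(df), names(rules))
  changed <- lapply(hit, function(nm) rules[[nm]](df[[nm]]))
  names(changed) <- paste0(hit, suffix)
  # List-style `[` (no comma) always returns a data frame, even for one column.
  untouched <- df[setdiff(names(df), hit)]
  cols <- c(changed, as.list(untouched))
  do.call(data.frame,
          c(cols, list(check.names = FALSE, stringsAsFactors = FALSE)))
}

# Demo input shaped like the question's list, with fixed values so the
# result is reproducible (the `extra` column is an assumption).
l <- list(
  `__a` = data.frame(`__ID` = c("x", "y"), col = c("p", "q"), extra = 1:2,
                     check.names = FALSE, stringsAsFactors = FALSE),
  `__b` = data.frame(`__ID` = c("z", "w"), col = c("r", "s"),
                     check.names = FALSE, stringsAsFactors = FALSE),
  `_c`  = data.frame(`__ID` = "v", col = "t",
                     check.names = FALSE, stringsAsFactors = FALSE)
)

# Keep only the list items whose names start with two underscores,
# then apply the single-data-frame function to each of them.
selected <- l[grepl("^__", names(l))]
l2 <- lapply(selected, transform_one, rules = rules)
```

Keeping the selection step, the per-data-frame function, and the rule table separate means each part can be changed or tested on its own, and no `eval(parse())` is needed.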

0 Answers