1

I have a list 'l' of data frames. Each of these data frames is 2-dimensional (rows and columns). For my work, I need to create another list whose data frames are subsets of the data frames in the original list.

E.g.: List l1 has two data frames, D1 and D2, with 10 and 12 different columns of data respectively. I want to create a new list l2 that also has two data frames, each made up of columns picked out of the earlier data frames D1 and D2. Note that the same column could be at a different position in D1 and D2, so I have to access it by column name and not by index.

Could someone please suggest how I could go about implementing this?

Meraj
  • `lapply(l, )`. If you want more specific code, you need to provide a more specific description of D3 and D4 than "basically subsets of D1 and D2". – Gregor Thomas Nov 22 '17 at 20:32
  • If you want the rows 1:5 and the columns 2 and 3, you could do `lapply(l, "[", 1:5, 2:3)`, but if you have conditions or something an example would go a long way. – Gregor Thomas Nov 22 '17 at 20:34
  • I want to extract specific columns from D1 and D2 – Meraj Nov 22 '17 at 20:34
  • Put it in your question! Make your question "How would I extract columns named `"X"` and `"MyFavoriteColumn"`?" or "How would I extract the 2nd, 4th, and 321st column?" or something like that. – Gregor Thomas Nov 22 '17 at 20:35
  • Sure will add it – Meraj Nov 22 '17 at 20:41
  • It's easier to help you if you provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input data and the desired output data. That way possible solutions can be tested and verified. – MrFlick Nov 22 '17 at 20:44

4 Answers

26

Here's an example. (This is the kind of thing you should have put in your question. You will get near-instantaneous help if you structure your question around a clear, copy/pasteable, reproducible example like this.)

Problem:

    # list of data frames:
    l = list(mtcars, mtcars)

    # vector of column names I would like to extract
    my_names = c("mpg", "wt", "am")
    # these columns might be at different positions in the data frames
Solution:

    result = lapply(l, "[", , my_names)

    # look at the top 6 rows of each to verify that it worked:
    lapply(result, head)
    # [[1]]
    #                    mpg    wt am
    # Mazda RX4         21.0 2.620  1
    # Mazda RX4 Wag     21.0 2.875  1
    # Datsun 710        22.8 2.320  1
    # Hornet 4 Drive    21.4 3.215  0
    # Hornet Sportabout 18.7 3.440  0
    # Valiant           18.1 3.460  0
    #
    # [[2]]
    #                    mpg    wt am
    # Mazda RX4         21.0 2.620  1
    # Mazda RX4 Wag     21.0 2.875  1
    # Datsun 710        22.8 2.320  1
    # Hornet 4 Drive    21.4 3.215  0
    # Hornet Sportabout 18.7 3.440  0
    # Valiant           18.1 3.460  0

Explanation: You essentially want to do `l[[1]][, my_names]`, `l[[2]][, my_names]`, and so on. `lapply` applies a function to every list element and returns the results in a list. In this case, the function is `[`, which takes rows as its first argument (we leave it blank to indicate all rows) and columns as its second argument (we give it `my_names`).
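
To check that column position really does not matter, here is a minimal sketch. The reversed copy of `mtcars` and the names `l_mixed`, `by_name`, and `one_col` are illustrative assumptions, not part of the original question. It also passes `drop = FALSE`, which should keep each result a data frame when only one column is requested.

```r
# Two data frames with the same columns in different positions
# (the second is mtcars with its column order reversed; illustrative only)
l_mixed <- list(mtcars, mtcars[, rev(names(mtcars))])
my_names <- c("mpg", "wt", "am")

# Selecting by name returns the columns in the order given in my_names,
# regardless of where they sit in each data frame
by_name <- lapply(l_mixed, function(d) d[, my_names])
sapply(by_name, function(d) identical(names(d), my_names))

# With a single column, `[` would drop to a plain vector by default;
# drop = FALSE keeps each result a data frame
one_col <- lapply(l_mixed, "[", , "mpg", drop = FALSE)
sapply(one_col, is.data.frame)
```

The anonymous function in the first `lapply` call is just the spelled-out form of passing `"[" ` with an empty row argument; which one to use is a matter of taste.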

Gregor Thomas
5

You can use dplyr. It is nice and easy, and the syntax is clear:

    library(dplyr)
    l <- list(mtcars, mtcars) # the list of 2 df
    new_list <- lapply(l, function(x) x %>% select(mpg, wt, am))
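
If the column names are stored in a character vector rather than typed out, a sketch along these lines may help. It assumes dplyr 1.0.0 or later for `all_of()` and `any_of()`; the names `my_names`, `strict`, and `lenient` and the column `"not_a_column"` are made up for illustration.

```r
library(dplyr)

l <- list(mtcars, mtcars)
my_names <- c("mpg", "wt", "am")

# all_of() selects exactly these columns and errors if one is missing
strict <- lapply(l, function(x) x %>% select(all_of(my_names)))

# any_of() skips names that are absent from a data frame
# ("not_a_column" is a hypothetical name that mtcars does not have)
lenient <- lapply(l, function(x) x %>% select(any_of(c(my_names, "not_a_column"))))
```

`all_of()` is the safer default when every data frame is expected to have every column; `any_of()` suits lists where some data frames may lack a few of them.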

Ciao!

theLudo
1

A purrr solution:

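    library(purrr)
    library(dplyr)
    # example data (l and my_names as in the base R answer above)
    l <- list(mtcars, mtcars)
    my_names <- c("mpg", "wt", "am")
    map(l, ~ .x |> select(all_of(my_names)))
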
Julian
0

I had a data frame with 21 columns, from which I wanted to extract columns 1 to 7, 11, and 21 into a separate data frame. This is what worked for me.

    mydata <- read.csv("data.csv")
    # keep columns 1 to 7, 11 and 21 by position
    newdatalist <- mydata[c(1:7, 11, 21)]

Sujoy