Find unique variable intersect between list of data frames

Question

I am trying to find the variables (that is, the values within a column) that are common to all data frames in a list.

I found this link, but it does not answer my problem because it compares only the column names, whereas I am interested in the values within a column.

To begin, I have a list of data frames; let's call it list1.

> list1

[[1]]
  V1 V2 V3
  1  "a" 1  
  2  "b" 9  
  3  "c" 3  

[[2]]
  V1 V2 V3
  1  "c" 5
  2  "d" 4
  3  "e" 6  
#and so on..... for 22 times

Now, I want to output an array of all the list1[[i]]$V2 values that are common to every data frame. So, if the remaining 20 data frames all look like list1[[2]], the output should be c, because it would be the only V2 value common to all of them.

I have tried using do.call("rbind", list1) and dplyr to find the common V2 values, but I can't figure it out. I also know that intersect() can be used here, but nesting it as intersect(intersect(intersect(...))) seems like a very inefficient approach, and I want to run this operation on other lists as well. Any help would be much appreciated.

Thank you very much,

-Omar.

You need to post code that creates a valid example and say what is a correct answer. Read [MCVE] and the more R-specific SO question: "How to create a great reproducible example in R". Unfortunately your problem description seems rather unclear. — IRTFM, Dec 20 '18 at 17:11

Andrew Gustar · Accepted Answer · 2018-12-20T17:27:07.213

Here is a tidyverse solution, using purrr::map and purrr::reduce...

library(tidyverse)
set.seed(999)
#first generate some data
dflist <- map(1:3,~tibble(V1=sample(letters[1:5],3),V2=sample(1:5,3)))

dflist
[[1]]
# A tibble: 3 x 2
  V1       V2
  <chr> <int>
1 b         5
2 c         4
3 a         1

[[2]]
# A tibble: 3 x 2
  V1       V2
  <chr> <int>
1 d         4
2 a         2
3 b         5

[[3]]
# A tibble: 3 x 2
  V1       V2
  <chr> <int>
1 a         4
2 d         1
3 e         3

#then...

map(dflist, ~.$V1) %>% #create a list just of the column of interest
    reduce(intersect)  #apply the intersect function cumulatively to the list

[1] "a"
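
For anyone who would rather avoid the tidyverse dependency, the same idea can probably be written in base R with lapply and Reduce. This is only a minimal sketch, not a tested drop-in: the list1 below is a hypothetical stand-in built from the two data frames shown in the question plus one invented third data frame, and it assumes V2 is stored as character rather than factor.

```r
# Hypothetical stand-in for the question's list1 (the real one has 22 elements;
# the third data frame here is invented purely for illustration)
list1 <- list(
  data.frame(V1 = 1:3, V2 = c("a", "b", "c"), V3 = c(1, 9, 3), stringsAsFactors = FALSE),
  data.frame(V1 = 1:3, V2 = c("c", "d", "e"), V3 = c(5, 4, 6), stringsAsFactors = FALSE),
  data.frame(V1 = 1:3, V2 = c("e", "c", "f"), V3 = c(2, 7, 8), stringsAsFactors = FALSE)
)

# Pull the V2 column out of every data frame, giving a list of character vectors
v2_columns <- lapply(list1, function(df) df$V2)

# Fold intersect() over the list: intersect(intersect(x1, x2), x3), and so on
common_v2 <- Reduce(intersect, v2_columns)

print(common_v2)
```

Reduce does the nested intersect(intersect(...)) calls for you, so the same two lines should work on a list of any length, and swapping V2 for another column name is the only change needed for other lists.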
