0

I have multiple datasets with several columns. I want to extract the columns common to all the datasets. For eg. If all the datasets have a column called "id" or "Age" i want to extract these. How can I know which columns are common to all the datasets.

Initially I thought of using the head function and then going through the values to check for similarities. But in case of several datasets this could turn out to be very tedious.

th_cy
  • 51
  • 1
  • 4

2 Answers2

5

We can use Reduce and intersect

Reduce(intersect, lapply(mget(ls(pattern = "df\\d+")), names))
#[1] "a" "c"

pattern argument in ls() should include the pattern your dataframes follow. In this example, I have considered the dataframes to be df1, df2 and df3. So it follows a pattern "df" followed by a number which I have mentioned in the pattern argument (df\\d+). You need to change this based on the pattern your dataframes have or you can also include them manually in a list if they do not have any pattern.

Reduce(intersect, lapply(list(df1, df2, df3), names))
#[1] "a" "c"

If you want to subset the common columns

list_df <- mget(ls(pattern = "df\\d+"))
common_cols <- Reduce(intersect, lapply(list_df, names))
lapply(list_df, `[`, common_cols)

#$df1
#  a  c
#1 1 11
#2 2 12
#3 3 13
#4 4 14
#5 5 15

#$df2
#  a  c
#1 1 11
#2 2 12
#3 3 13
#4 4 14
#5 5 15

#$df3
#  a  c
#1 1 11
#2 2 12
#3 3 13
#4 4 14
#5 5 15

data

df1 <- data.frame(a  = 1:5, b = 2:6, c = 11:15)
df2 <- data.frame(a  = 1:5, c = 11:15)
df3 <- data.frame(a  = 1:5, b = 2:6, c = 11:15, d = 21:25)
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
0

We can use tidyverse approaches

library(tidyverse)
map(mget(paste0("df", 1:3)), names) %>%
       reduce(intersect)
#[1] "a" "c"

data

df1 <- data.frame(a  = 1:5, b = 2:6, c = 11:15)
df2 <- data.frame(a  = 1:5, c = 11:15)
df3 <- data.frame(a  = 1:5, b = 2:6, c = 11:15, d = 21:25)
akrun
  • 874,273
  • 37
  • 540
  • 662