
I work with a large number of data frames in R. I want to find the data frames with the minimum and the maximum number of columns, and then find the difference in their column names. However, I got stuck turning my map_dbl results into a regular tibble.

first_df   = data.frame(matrix(rnorm(20), nrow=10))
second_df  = data.frame(matrix(rnorm(20), nrow=4))
third_df   = data.frame(matrix(rnorm(20), nrow=5))

library(dplyr)
library(purrr)
library(tibble)
library(tidyr)

# capturing all the data frames
mget(ls(pattern = "_df")) %>%
  map_dbl(ncol) %>%
  as_tibble()


# expected output
# first_df   2
# second_df  5

## Finding the difference in columns
# setdiff(x, y) keeps the elements of x that are not in y, so the wider data frame goes first
col_diff <- setdiff(colnames(second_df), colnames(first_df))
Hamideh
  • Please look for previous questions, as this is a duplicate: https://stackoverflow.com/questions/40036207/tidyverse-prefered-way-to-turn-a-named-vector-into-a-data-frame-tibble – Annet Apr 30 '21 at 08:51

1 Answer

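You can use enframe() to turn the named vector into a two-column tibble, then sort by the column count and keep the first and last rows: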

library(tidyverse)

min_max <- mget(ls(pattern = "_df")) %>%
  map_dbl(ncol) %>%
  enframe() %>%
  arrange(value) %>%
  slice(1, n())

min_max

# A tibble: 2 x 2
#  name      value
#  <chr>     <dbl>
#1 first_df      2
#2 second_df     5

setdiff(names(get(min_max$name[2])), names(get(min_max$name[1])))
#[1] "X3" "X4" "X5"
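
A variant of the same idea, as a rough sketch rather than a drop-in replacement: it keeps the data frames in a named list so that get() is not needed afterwards. The names dfs, n_cols, col_counts, narrowest, widest and extra_cols are made up for this illustration, and the list is built by hand here instead of with mget(ls(pattern = "_df")).

```r
library(tibble)

# Same example data as in the question
first_df  <- data.frame(matrix(rnorm(20), nrow = 10))  # 2 columns
second_df <- data.frame(matrix(rnorm(20), nrow = 4))   # 5 columns
third_df  <- data.frame(matrix(rnorm(20), nrow = 5))   # 4 columns

# Hypothetical named list holding the data frames of interest
dfs <- list(first_df = first_df, second_df = second_df, third_df = third_df)

# Named integer vector of column counts, then a name/count tibble
n_cols <- vapply(dfs, ncol, integer(1))
col_counts <- enframe(n_cols, name = "name", value = "n_cols")

# which.min()/which.max() return the position of the first minimum/maximum
narrowest <- dfs[[which.min(n_cols)]]
widest    <- dfs[[which.max(n_cols)]]

# Columns present in the widest data frame but missing from the narrowest
extra_cols <- setdiff(names(widest), names(narrowest))
extra_cols
```

If several data frames tie for the minimum or maximum, which.min() and which.max() pick the first one in list order, so the result depends on how the list is built.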
Ronak Shah