How can I select the column using the column name in a different dataframe?

Question

I want to select columns (or make a subset) of one dataframe using the column names of a different dataframe.

e.g.

A dataframe: 3 columns

each column name: ab, cd, de

B dataframe: 10 columns

each column name: n_ab, n_cd_e, n_de, ab, fg, n_ef, tt, yy, zz, n_a2

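I want to make two subsets of the B dataframe.

subset C dataframe (columns whose names contain a column name of A): n_ab, n_cd_e, n_de, ab
subset D dataframe (columns whose names exactly match a column name of A): ab

How can I make the C and D dataframes?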

I expected the code below to make the subset of B, but it did not, because contains() can only match a literal string that I type in.

Also, how can I select columns (or make a subset) using a condition (such as >=, %in%, ==)?

ge<-select(ge.n, contains('ge'))

Thanks

welcome to `stackoverflow`, please edit your question to include your code and sample in proper format. — Ed_Gravy, Aug 03 '22 at 12:24
Please take a read of [this post](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) on how to ask a good question. Currently it's unclear what you're asking. — user438383, Aug 03 '22 at 12:26

score 1 · Answer 1 · answered Aug 03 '22 at 12:42

To create C you can use grepl with an OR pattern built from A's column names. Here the pattern is "ab|cd|de", so any column of B whose name contains one of those strings is kept.

C = B[, grepl(paste0(names(A), collapse="|"), names(B)), drop=FALSE]

To create D you can use %in% directly.

D = B[, names(B) %in% names(A), drop=FALSE]

Outputs (C and D, respectively):

        n_ab     n_cd_e       n_de         ab
1 -0.4456620  0.4007715  1.7869131  0.7013559
2  1.2240818  0.1106827  0.4978505 -0.4727914
3  0.3598138 -0.5558411 -1.9666172 -1.0678237


          ab
1  0.7013559
2 -0.4727914
3 -1.0678237
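One caveat with the grepl approach: the names of A are treated as regular expressions, so a name containing characters such as `.` or `(` could match more than intended. A minimal base R sketch of a literal-substring variant follows. The helper `contains_any` is a hypothetical name introduced here, and the toy values are arbitrary; only the column names come from the question.

```r
# Toy data with the column names from the question (values are arbitrary).
A <- data.frame(ab = 1:3, cd = 4:6, de = 7:9)
B <- as.data.frame(matrix(seq_len(30), nrow = 3))
names(B) <- c("n_ab", "n_cd_e", "n_de", "ab", "fg",
              "n_ef", "tt", "yy", "zz", "n_a2")

# Hypothetical helper: TRUE for each element of `nms` that contains at least
# one element of `keys` as a literal substring (fixed = TRUE turns off regex).
contains_any <- function(nms, keys) {
  per_key <- lapply(keys, function(k) grepl(k, nms, fixed = TRUE))
  Reduce("|", per_key)  # element-wise OR across all keys
}

C <- B[, contains_any(names(B), names(A)), drop = FALSE]
names(C)
```

With these names the result should have the same columns as the regex version, since none of A's names contain regex metacharacters.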

Inputs:

set.seed(123)
A = setNames(as.data.frame(
  replicate(3,rnorm(3))),  c("ab","cd","de")
)
B = setNames(as.data.frame(
  replicate(10,rnorm(3))),  c("n_ab", "n_cd_e", "n_de", "ab", "fg", "n_ef", "tt", "yy", "zz", "n_a2")
)
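Since the question uses dplyr's select() and contains(), the same subsets can probably be written with tidyselect helpers as well. This is a sketch that assumes dplyr (1.0.0 or later) is installed; the toy values and the threshold of 10 in the last call are arbitrary choices for illustration.

```r
library(dplyr)  # assumption: dplyr is installed

# Toy data with the column names from the question (values are arbitrary).
A <- data.frame(ab = 1:3, cd = 4:6, de = 7:9)
B <- as.data.frame(matrix(seq_len(30), nrow = 3))
names(B) <- c("n_ab", "n_cd_e", "n_de", "ab", "fg",
              "n_ef", "tt", "yy", "zz", "n_a2")

# contains() accepts a character vector and keeps columns whose names
# contain any of its elements; column order may differ from B's order.
C <- select(B, contains(names(A)))

# any_of() keeps exact name matches and ignores names that are absent from B.
D <- select(B, any_of(names(A)))

# where() selects columns by a condition on their values,
# here columns whose mean is at least 10 (arbitrary threshold).
E <- select(B, where(function(x) mean(x) >= 10))
```

The where() call is one way to handle the "select by condition" part of the question; operators such as `>=` or `==` go inside the function, which must return a single TRUE or FALSE per column.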
