
I have one data frame (df1) with more than 200 columns, each holding several thousand rows of data. The column names are alphanumeric and all distinct.

I have a second data frame (df2) with a couple of columns. Its first column (named 'col1') contains values that are column names of df1.

However, not every row in df2 has a corresponding column in df1.

I would like to drop all rows in df2 whose 'col1' value does not match any column name in df1.

I searched for quite a while using keywords like "subset data.frame by values from another data.frame" but did not find a solution. I checked, e.g. here, here or here and some other places.

Thanks for your help.

Slyrs
  • Can you create a small reproducible example? [See tips here](http://stackoverflow.com/q/5963269/903061) - use built-in data, or simulate data, or use `dput()` to share reproducibly. – Gregor Thomas Jun 08 '16 at 20:53
  • But maybe what you want is `df2[df2$col1 %in% names(df1), ]`. It doesn't seem to matter at all that `df1` is a data frame; the only thing that matters is that you have a character vector of values you want to keep, and that happens to be `names(df1)`. – Gregor Thomas Jun 08 '16 at 20:54
  • Thanks to @Gregor and Effel. You did the trick! – Slyrs Jun 08 '16 at 21:08

1 Answer


Data:

    df1 <- data.frame(a = 1:3, b = 1:3)
    #   a b
    # 1 1 1
    # 2 2 2
    # 3 3 3

    df2 <- data.frame(col1 = c("a", "c"))
    #   col1
    # 1    a
    # 2    c

Keep rows in df2 whose values are names in df1:

    subset(df2, col1 %in% names(df1))
    #   col1
    # 1    a
effel