
I have two data frames, bwenv and bwsp. The samples in bwsp are a subset of those in bwenv, and the two data frames share row names (sample IDs). I would like to subset bwenv so that it includes only the rows whose row names also appear in bwsp.

When the two data frames have the same number of rows, I have used:

bw2015 <- cbind(bwenv, bwsp)

to create a new data frame with the combined data.

My question is very similar to the one asked here: "R subset a column in data frame based on another data frame/list", but there the subsetting is done on a column of data in each data frame rather than on row names, which is what I want to do.

ayesha

2 Answers

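library(dplyr)

# add_rownames() copies the row names into a regular column so that
# semi_join() can match on them; it is deprecated in favour of
# tibble::rownames_to_column()
bw2015 <- bwenv %>% 
  add_rownames("row_names") %>%
  semi_join(add_rownames(bwsp, "row_names"), by = "row_names")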
yeedle
  • Thank you! After doing this, I can reassign the first column back into row names. – ayesha May 09 '17 at 20:26
  • yes. `rownames(bw2015) <- bw2015$row_names` and then `bw2015 <- bw2015 %>% select(-row_names)` – yeedle May 09 '17 at 20:30
  • Oh, I think I spoke too soon after your initial solution. I'm getting a warning that says: `Warning message: Deprecated, use tibble::rownames_to_column() instead.` – ayesha May 09 '17 at 21:08

Following on from @yeedle's solution, I modified it a little and found this worked for me:

library(dplyr)
library(tibble)  # rownames_to_column() comes from tibble
bwenv2 <- bwenv %>% 
  rownames_to_column("row_names") %>%
  semi_join(rownames_to_column(bwsp, "row_names"), by = "row_names")
rownames(bwenv2) <- bwenv2$row_names 
bwenv2 <- bwenv2 %>% select(-row_names)

bw2015 <- cbind(bwenv2, bwsp)
str(bw2015)
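
For comparison, the same subsetting can probably be done in base R without moving the row names into a column. This is only a sketch on made-up data: the toy `bwenv` and `bwsp` below are hypothetical stand-ins, and it assumes every row name in `bwsp` also exists in `bwenv`, as described in the question. `match()` looks the row names up exactly and returns them in the order of `bwsp`, so the `cbind()` that follows pairs the rows sample by sample.

```r
# Hypothetical toy data standing in for the real bwenv and bwsp
bwenv <- data.frame(temp  = c(10.1, 11.5, 9.8, 12.0),
                    depth = c(5, 10, 15, 20),
                    row.names = c("s1", "s2", "s3", "s4"))
bwsp <- data.frame(count = c(7, 3),
                   row.names = c("s4", "s2"))

# Positions in bwenv of each row name in bwsp (exact matches, bwsp order)
idx <- match(rownames(bwsp), rownames(bwenv))

# Keep only those rows; drop = FALSE keeps the result a data frame
bwenv2 <- bwenv[idx, , drop = FALSE]

# Rows are now in the same order in both data frames, so cbind() lines up
bw2015 <- cbind(bwenv2, bwsp)
```

If the original row order of `bwenv` should be kept instead, `bwenv[rownames(bwenv) %in% rownames(bwsp), , drop = FALSE]` does the same filtering, but `bwsp` would then need to be reordered to match before calling `cbind()`.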
ayesha