Drop a column that was used as 'by' argument in join

Question

I have the following query:

library(dplyr)
FinalQueryDplyr <- PostsWithFavorite %>%
  inner_join(Users, by = c("OwnerUserId" = "Id"), keep = FALSE) %>%
  select(DisplayName, Age, Location, FavoriteTotal, MostFavoriteQuestion, MostFavoriteQuestionLikes) %>%
  select(-c(OwnerUserId)) %>%
  arrange(desc(FavoriteTotal))

As you can see, I use the OwnerUserId column as the joining column between 2 data frames.

I want the result data frame to only have other columns, without the OwnerUserId column visible.

I 'deselect' the OwnerUserId column twice in this query:

once by not including it in the first select clause
once by explicitly deselecting it with select(-c(OwnerUserId))

Yet it is still visible in the result: OwnerUserId DisplayName Age Location FavoriteTotal MostFavoriteQuestion MostFavoriteQuestionLikes

How can I get rid of the column that was used as a joining column in dplyr?

Without being able to work with your data, it's pretty hard to know what's going on. A [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) would be helpful. Can't do any more than guess, but is the data frame grouped by that column? — camille, Apr 18 '20 at 17:28
Have you tried `ungroup`ing? That's the most probable reason I can think of. If a var is a grouping var, dplyr will not (de-)select it. — stefan, Apr 18 '20 at 18:05

score 1 · Accepted Answer · answered Apr 18 '20 at 19:31

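If the data is still grouped by OwnerUserId (as the comments suggest), select() will always keep that column. One option is to remove the grouping attribute by converting to a plain data.frame: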

library(dplyr)
PostsWithFavorite %>%
   inner_join(Users, by = c("OwnerUserId" = "Id"), keep = FALSE) %>%
   select(DisplayName, Age, Location, FavoriteTotal, 
          MostFavoriteQuestion, MostFavoriteQuestionLikes) %>%
   as.data.frame() %>%   # drops the grouping, so OwnerUserId can now be removed
   select(-c(OwnerUserId)) %>%
   arrange(desc(FavoriteTotal))
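
As the comments suggest, `ungroup()` may be a more direct way to get the same effect. Here is a minimal sketch on made-up data. The two small tables are hypothetical stand-ins, since the original data was not posted, and it assumes PostsWithFavorite is still grouped by OwnerUserId from an earlier `group_by()`:

```r
library(dplyr)

# Hypothetical stand-ins for the asker's tables (the real data was not posted).
Users <- data.frame(
  Id = c(1, 2),
  DisplayName = c("ann", "bob"),
  stringsAsFactors = FALSE
)
PostsWithFavorite <- data.frame(
  OwnerUserId = c(1, 2),
  FavoriteTotal = c(10, 25)
) %>%
  group_by(OwnerUserId)   # assumption: still grouped from an earlier step

# The join keeps the grouping of the left-hand table.
joined <- PostsWithFavorite %>%
  inner_join(Users, by = c("OwnerUserId" = "Id"))

# While the data is grouped, select() re-adds the grouping column
# (dplyr prints "Adding missing grouping variables").
still_grouped <- joined %>%
  select(DisplayName, FavoriteTotal)

# After ungroup(), the same select() leaves OwnerUserId out.
result <- joined %>%
  ungroup() %>%
  select(DisplayName, FavoriteTotal) %>%
  arrange(desc(FavoriteTotal))

names(still_grouped)
names(result)
```

If the real data turns out not to be grouped, `ungroup()` is a no-op and the cause lies elsewhere; `group_vars(PostsWithFavorite)` is a quick way to check.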
