I want to combine two tables using semi_join, because table 2 (all_drafts_adj) is the basis for filtering table 1 (all_stats) to produce draft_all_stats.

draft_all_stats <- all_stats %>%
  semi_join(all_drafts_adj, by = "Player") %>%
  drop_na()

I noticed that the number of observations in the result does not match the number of observations in table 2 (all_drafts_adj). The difference is due to how specific players are named in table 2 vs. table 1 (e.g. the "Player" recorded as "Dennis Smith" in table 2 is recorded as "Dennis Smith Jr" in table 1).

I tried using the following R script, but it replaced all Player names instead of the specific observation:

all_stats$Player <- str_remove("Dennis Smith Jr", "Jr") 

Most of the transform/mutate examples I found target entire columns or entire observations. Any advice on what R script to use to change specific observations within the data table?

Ian Campbell
xiuxiu
  • Hi xiuxiu, welcome to Stack Overflow. It will be much easier to help if you provide at least a sample of your data with `dput(all_drafts_adj)` or, if your data is very large, `dput(all_drafts_adj[1:30,])`. Please do the same for `all_stats`. You can edit your question and paste the output. You can surround it with three backticks (```) for better formatting. See [How to make a reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for more info. – Ian Campbell May 08 '20 at 20:52
  • Thanks for the tip, Ian! Am learning on the fly. Appreciate the guidance and patience by the community :) – xiuxiu May 09 '20 at 09:23

1 Answer

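If the names should be matched on a partial match, one option is regex_semi_join from the fuzzyjoin package. It treats each "Player" value in all_drafts_adj as a regular expression and keeps the rows of all_stats whose "Player" contains a match: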

library(dplyr)      # %>%
library(tidyr)      # drop_na()
library(fuzzyjoin)  # regex_semi_join()
draft_all_stats <- all_stats %>%
     regex_semi_join(all_drafts_adj, by = "Player") %>%
     drop_na()
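To see why the regex join catches the "Jr" case, here is a minimal sketch on made-up data. Only the "Dennis Smith" / "Dennis Smith Jr" pair comes from the question; "Player A", "Player B", and the `games` and `pick` columns are hypothetical placeholders.

```r
library(fuzzyjoin)

# Hypothetical stand-ins for the real tables, which are not shown in the question.
all_stats <- data.frame(
  Player = c("Dennis Smith Jr", "Player A", "Player B"),
  games  = c(10, 20, 30),
  stringsAsFactors = FALSE
)
all_drafts_adj <- data.frame(
  Player = c("Dennis Smith", "Player A"),
  pick   = c(1, 2),
  stringsAsFactors = FALSE
)

# Each Player in all_drafts_adj is used as a regular expression, so the
# pattern "Dennis Smith" is found inside "Dennis Smith Jr".
matched <- regex_semi_join(all_stats, all_drafts_adj, by = "Player")
print(matched)
```

Because the names are treated as patterns, a short name can also match a longer, unrelated one, and names containing regex metacharacters such as "." are interpreted as regex syntax, so it is worth checking the matched rows by eye.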

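Or use a string-distance approach with stringdist_semi_join, also from fuzzyjoin. Its default max_dist is 2, while "Dennis Smith" and "Dennis Smith Jr" are 3 edits apart, so the threshold is raised here. Keep in mind that a larger max_dist can also match different players with similar names.

draft_all_stats <- all_stats %>%
     stringdist_semi_join(all_drafts_adj, by = "Player", max_dist = 3) %>%
     drop_na()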
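If only a handful of names differ, another route is to correct those names directly before the ordinary semi_join. The attempt in the question calls str_remove on the literal string "Dennis Smith Jr" rather than on the column, so a single value is recycled into every row. A minimal sketch of a targeted fix, assuming dplyr and stringr and using a made-up `all_stats` ("Player A", "Player B", and `games` are hypothetical):

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for all_stats; the real columns are not shown in the question.
all_stats <- tibble(
  Player = c("Dennis Smith Jr", "Player A", "Player B"),
  games  = c(10, 20, 30)
)

# Option 1: rewrite only the rows whose Player equals one specific name.
fixed_one <- all_stats %>%
  mutate(Player = if_else(Player == "Dennis Smith Jr", "Dennis Smith", Player))

# Option 2: strip a trailing " Jr" (with an optional period) from any name that has it.
# Names without the suffix are returned unchanged.
fixed_all <- all_stats %>%
  mutate(Player = str_remove(Player, " Jr\\.?$"))

print(fixed_one)
print(fixed_all)
```

Option 1 is the safer choice when the list of mismatched names is known. Option 2 is broader and assumes the other table never includes the suffix.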
akrun