Ok folks,
I want to join two data frames via one identifier (key as same column name, is different can be renamed, doesn't make much of a difference) in R and I have tried the normal merge function, dplyr left_join, data.table and get the same error each and every time:
all the values for the rows that DO match and get returned have NA as value for the metric (see second screenshot). To just be sure it is no problem with the data I also did a simple VLOOKUP in excel and there it works flawless, I get the proper values. Now wanting to replicate and outmating this via an R script, it seemingly works but just either returns NA values or somehow turns the values into NA's. Note: I know that the second dataframe (some 6000 observations, see also screenshot) does have a lot of non-matching rows as well, but that shouldn't be a problem, right? I also tried changing the orientation, so instead of a left join with table x with table y, I tried table y with table x.
It shouldn't make a difference, since I just want the additional rows where the identifier (column name) fits. I even tried with a full join (also without pre-selecting columns) but it results in the same problem. No error or even warning log I could rely on as well. I for the life of me can not figure out why that would happen or what I am doing wrong. I checked the columns and names, they are not factors, they have class chr, so this should not be an issue for joining (I found similar problems online where this seemed to be an issue).
Thankful for any pointers.
Have a nice evening everyone!
#import and load reports
skraart2019<-read_xlsx("skraart2019.xlsx", sheet="ArticleDownloads256", skip=11)
savedrecs<-read_xls("savedrecs.xls", skip=28)
# join reports by DOI, filter by columns you need
skrafin<- savedrecs %>%
full_join(skraart2019,by="DOI") %>%
filter(`Publication Year`=="2018") %>%
select(-(22:57),-(79:90))
# whitespace removal just in case, mutate and data.table joins
skraart2019 <-skraart2019%>% mutate(DOI= trimws(as.character(DOI))); savedrecs <- savedrecs %>% mutate(DOI = trimws(as.character(DOI))); skrafin3 <- left_join(skraart2019,savedrecs)
joined<- merge(savedrecs,skraart2019,by.x = "DOI",all.x = TRUE)
library(data.table)
skraartdt<- setDT(skraart2019)
saveddt<- setDT(savedrecs)
complete<-skraartdt[saveddt,on="DOI"]
[RStudio][1][NA values][2]
[1]: https://i.stack.imgur.com/S2EX7.png
[2]: https://i.stack.imgur.com/P23zp.png
project files
https://drive.google.com/file/d/1At9lSqs5YzbbOKeNCZ1d3K8xLngGMC98/view?usp=sharing