I have a data table, first_median, which includes a column location.
I have another data table that has location and the name of the city in it.
I want to merge them so that the initial data table, first_median, gets the city names.
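A minimal sketch of the intended merge, assuming the data.table package and small made-up tables (the values and the reduced set of columns are illustrative stand-ins, not my real data; LOI is the name of the lookup table in my code further down):

```r
library(data.table)

# Hypothetical stand-in for the table that needs city names.
first_median <- data.table(
  location    = rep("44.03125_-123.09375", 3),
  time_period = c("1950-2005", "1979-2015", "2006-2025"),
  median      = c(72, 68, 78)
)

# Hypothetical stand-in for the lookup table (location -> city).
LOI <- data.table(
  location = "44.03125_-123.09375",
  city     = "Eugene"
)

# Left join: keep every row of first_median and add city where location matches.
merged <- merge(first_median, LOI, by = "location", all.x = TRUE)
print(merged)
```

With data built from scratch like this, I would expect every row to get its city, which is what I want from the real tables as well.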
The problem is that the merge produces NAs for some of the rows. To be more specific, the coordinate 44.03125_-123.09375 has the name Eugene. After merging, the first two occurrences of 44.03125_-123.09375 are mapped to Eugene, but the rest are mapped to NA.
The next weird part is that if I convert first_median to a data frame, as.data.frame(first_median), then back to a data table, data.table(first_median), and then do the merge, it works!
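One thing that could be compared before and after the round trip is the key recorded on the table. The sketch below uses made-up data and only shows how to inspect that attribute with key(); I am assuming, not claiming, that it is relevant to the NA problem.

```r
library(data.table)

# Hypothetical keyed table; setkey() sorts by location and records the key.
dt <- data.table(location = c("b", "a", "b"), median = c(1, 2, 3))
setkey(dt, location)

# The same round trip as described above: data.table -> data.frame -> data.table.
round_trip <- data.table(as.data.frame(dt))

# Compare the recorded key and the attribute names before and after.
print(key(dt))
print(key(round_trip))
print(names(attributes(dt)))
print(names(attributes(round_trip)))
```

If the two tables hold the same values but differ in their attributes, that difference would be the first place to look.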
Please take a look at the image.
Any idea what is going on?
Also, to make this clearer, I changed the code to:
# Merge before the round trip; some city values come back as NA.
first_medians_merged_before <- merge(first_medians, LOI, by = "location", all.x = TRUE)
dput(head(first_medians_merged_before, 5))

# Round trip through data.frame, then merge again; now every city is filled in.
first_medians <- as.data.frame(first_medians)
first_medians <- data.table(first_medians)
first_medians_merged_after <- merge(first_medians, LOI, by = "location", all.x = TRUE)
dput(head(first_medians_merged_after, 5))
The outputs of dput are below:
> dput(head(first_medians_merged_before, 5))
structure(list(location = c("44.03125_-123.09375", "44.03125_-123.09375",
"44.03125_-123.09375", "44.03125_-123.09375", "44.03125_-123.09375"
), time_period = c("1950-2005", "1950-2005", "1979-2015", "1979-2015",
"2006-2025"), emission = c("RCP 4.5", "RCP 8.5", "RCP 4.5", "RCP 8.5",
"RCP 4.5"), median = c(72, 72, 68, 68, 78), city = c("Eugene",
"Eugene", NA, NA, NA)), sorted = "location", class = c("data.table",
"data.frame"), row.names = c(NA, -5L), .internal.selfref = <pointer: 0x1028114e0>)
> dput(head(first_medians_merged_after, 5))
structure(list(location = c("44.03125_-123.09375", "44.03125_-123.09375",
"44.03125_-123.09375", "44.03125_-123.09375", "44.03125_-123.09375"
), time_period = c("1950-2005", "1950-2005", "1979-2015", "1979-2015",
"2006-2025"), emission = c("RCP 4.5", "RCP 8.5", "RCP 4.5", "RCP 8.5",
"RCP 4.5"), median = c(72, 72, 68, 68, 78), city = c("Eugene",
"Eugene", "Eugene", "Eugene", "Eugene")), sorted = "location", class = c("data.table",
"data.frame"), row.names = c(NA, -5L), .internal.selfref = <pointer: 0x1028114e0>)
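To pinpoint exactly which rows differ between the two results, the two outputs can be compared row by row. This sketch rebuilds the five rows shown in the dput output above as plain data tables (the names before and after are just for illustration) and lists the rows where city is NA in the first result but filled in the second:

```r
library(data.table)

# The five rows from the "before" dput output.
before <- data.table(
  location    = rep("44.03125_-123.09375", 5),
  time_period = c("1950-2005", "1950-2005", "1979-2015", "1979-2015", "2006-2025"),
  emission    = c("RCP 4.5", "RCP 8.5", "RCP 4.5", "RCP 8.5", "RCP 4.5"),
  median      = c(72, 72, 68, 68, 78),
  city        = c("Eugene", "Eugene", NA, NA, NA)
)

# The "after" output is identical except that every city is filled in.
after <- copy(before)[, city := "Eugene"]

# Row indices where the first merge missed a city that the second merge found.
diff_rows <- which(is.na(before$city) & !is.na(after$city))
print(before[diff_rows])

# Count of missing cities per location in the first result.
print(before[, .(n_missing = sum(is.na(city))), by = location])
```

Only the city column differs; location, time_period, emission and median are the same in both outputs.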