My problem is: I want to keep the observation column from data frame y when I join the two, so that I can reference each row back to the original data frame. Right now, when I perform a left_join(), I get NA values for the observations. The column in data frame y is named "Obs".
I have already tried using different types of join and swapping the x and y data frames.
Simple example of what I am trying to do:
library(dplyr)  # provides %>% and left_join()
x = data.frame(fun = c("cool", "neat", "awesome", "neat1", "amazing", "sweet"), address = c("100", "1100", "99", "900", "55", "200"), state = c("IL", "CO", "MO", "CA", "MA", "TX"), date = c(12, 3, 4, 6, 8, 9))
y = data.frame(fun = c("cool", "neat", "awesome", "super"), address = c("100", "1100", "99", "55"), state = c("IL", "CO", "MO", "MA"), status = c(TRUE, FALSE, TRUE, TRUE))
# Row number in y, used to trace joined rows back to y
y$Obs = 1:nrow(y)
x %>% left_join(y, by = c("address", "state"))
For some reason the sample code above works and shows the observations. However, when I run it on my actual data sets, where data frame x has about 18,000 records and data frame y has about 2,100 records, I get all NA values for the observations, even though the rows match on state and address.
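A check that might help narrow this down is sketched below. It assumes the mismatch comes from how the key columns are stored (stray whitespace, letter case, or a different column type), which is only a guess and not something confirmed on the real data. The data frames x_demo and y_demo and the helper clean_keys() are hypothetical names made up for this illustration.

```r
library(dplyr)

# Hypothetical toy data: the keys look identical when printed,
# but differ by a trailing space ("100 ") and by letter case ("co").
x_demo <- data.frame(address = c("100 ", "1100", "99"),
                     state   = c("IL", "co", "MO"),
                     stringsAsFactors = FALSE)
y_demo <- data.frame(address = c("100", "1100", "99"),
                     state   = c("IL", "CO", "MO"),
                     stringsAsFactors = FALSE)
y_demo$Obs <- seq_len(nrow(y_demo))

# Compare the storage type of each key column in both data frames
sapply(x_demo[c("address", "state")], class)
sapply(y_demo[c("address", "state")], class)

# Rows of y_demo that find no partner in x_demo on the raw keys
unmatched <- anti_join(y_demo, x_demo, by = c("address", "state"))

# Hypothetical helper: coerce keys to character, trim whitespace,
# and upper-case the state before joining
clean_keys <- function(df) {
  df %>%
    mutate(address = trimws(as.character(address)),
           state   = toupper(trimws(as.character(state))))
}

joined <- left_join(clean_keys(x_demo), clean_keys(y_demo),
                    by = c("address", "state"))
```

If anti_join() on the real data returns most of y, the keys are not actually equal as stored, and comparing a few values with dput() may show the difference.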
What I expect is a new joined data frame with an Obs column whose values are the same as in data frame y, so each row can be traced back to it. What I actually get is all NA values for Obs.