
This has worked for me before, but now it doesn't, and I spent two days tinkering with it before asking for help here. I have two datasets, one called Access, the other CO2. Each one has three variables, two of which are common and are what I want to use to merge the two datasets. Just to play it really safe, I am pasting the head() and str() outputs here:

> head(Access)                      > head(CO2)
       x     y  access                     x     y   CO2equ
1 -32.65 83.65    0.00              1 -32.65 83.65 183316.4
2 -36.85 83.55 4481.25              2 -36.85 83.55 173327.8
3 -36.75 83.55 4464.75              3 -36.75 83.55 301413.9
4 -36.65 83.55 4448.25              4 -36.65 83.55 360757.2
5 -36.55 83.55 4431.00              5 -36.55 83.55 409523.5
6 -36.45 83.55 4414.50              6 -36.45 83.55 448302.0

> str(Access)                                       
'data.frame':   2183106 obs. of  3 variables:       
 $ x     : num  -32.7 -36.8 -36.8 -36.7 -36.5 ...   
 $ y     : num  83.7 83.5 83.5 83.5 83.5 ...        
 $ access: num  0 4481 4465 4448 4431 ...           
 - attr(*, "data_types")= chr  "N" "N" "N"          

> str(CO2)
'data.frame':   2183106 obs. of  3 variables:
 $ x     : num  -32.7 -36.9 -36.8 -36.7 -36.6 ...
 $ y     : num  83.6 83.5 83.5 83.5 83.5 ...
 $ CO2equ: num  183316 173328 301414 360757 409523 ...
 - attr(*, "data_types")= chr  "N" "N" "N"

Now I am trying two versions of merge(). The first one results in an empty data.frame; the second results in every row appearing twice, once with the variables from the first dataset and once with the variables from the second:

> M1 = merge(Access, CO2, c("x","y"))
> head(M1)
[1] x      y      access CO2equ
<0 rows> (or 0-length row.names)

> M2 = merge(Access, CO2, by=c("x","y"), all=TRUE)
> length(M2$x)
[1] 4366212
> head(M2)
        x      y access CO2equ
1 -179.95 -89.95     NA      0
2 -179.95 -89.85     NA      0
3 -179.95 -89.75     NA      0
4 -179.95 -89.65     NA      0
5 -179.95 -89.55     NA      0
6 -179.95 -89.45     NA      0

Obviously, the respective x- and y-values are not recognized as being equivalent, but I do not know why. The data types are the same, the values look the same, and worst of all, I did this successfully a few months ago. Back then, I saved the command history, and now when I just copy and paste it into my R console, it does not work. I tried it in both R 2.13.0 and Revolution R Enterprise 4.3. I am reasonably sure that this is not a software bug but something trivial that I have overlooked, even after spending two days on this.

Cheers,
Jochen

  • 1
    You should add the output from `dput()` to your question as your example works for me. – Chase Sep 06 '11 at 19:06
  • 4
    I suspect that x and y have some digits that aren't being displayed. – Ari B. Friedman Sep 06 '11 at 19:29
  • 2
    Agree with gsk3. You are implicitly testing floating point numbers for equality and most likely being tripped up by FAQ 7.31: http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f – IRTFM Sep 06 '11 at 19:53
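
To illustrate the point made in the comments above, here is a minimal sketch (not taken from the original data) of two numbers that print the same at default precision but are not exactly equal. The variable names are made up for the illustration; the key columns in Access and CO2 may be affected in a similar way, although that cannot be confirmed without `dput()` output.

```r
# Two values that look identical at R's default print precision
a <- 0.1 + 0.2
b <- 0.3
print(a)
print(b)

# Exact comparison versus comparison with a tolerance
exact_equal <- a == b
near_equal  <- isTRUE(all.equal(a, b))
print(c(exact = exact_equal, near = near_equal))

# Printing more significant digits reveals the hidden difference
print(format(c(a, b), digits = 17))
```

Running `format(head(Access$x), digits = 17)` and the same for `CO2$x` would be one way to check whether the real keys carry such hidden digits.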

1 Answer


Try round(..., 1) on both x and y before the merge.
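
A minimal sketch of this suggestion, using small hypothetical data frames in place of the real Access and CO2 (the tiny offsets added to the CO2 keys are an assumption, made only to reproduce the symptom). Because the `head()` output in the question shows coordinates with two decimal places, the sketch rounds to 2 digits rather than 1; the right number of digits depends on the grid resolution of the actual data.

```r
# Hypothetical stand-ins for the real data; the 1e-6 offsets imitate
# hidden digits that do not show up at default print precision
Access <- data.frame(x = c(-32.65, -36.85, -36.75),
                     y = c(83.65, 83.55, 83.55),
                     access = c(0, 4481.25, 4464.75))
CO2 <- data.frame(x = c(-32.65, -36.85, -36.75) + 1e-6,
                  y = c(83.65, 83.55, 83.55) - 1e-6,
                  CO2equ = c(183316.4, 173327.8, 301413.9))

# Merging on the raw keys: the keys do not match
before <- merge(Access, CO2, by = c("x", "y"))
print(nrow(before))

# Round both key columns in both data frames, then merge again
Access_r <- transform(Access, x = round(x, 2), y = round(y, 2))
CO2_r    <- transform(CO2,    x = round(x, 2), y = round(y, 2))
after <- merge(Access_r, CO2_r, by = c("x", "y"))
print(after)
```

Rounding to fewer digits than the grid actually uses could map neighbouring cells onto the same key, so it is worth checking `anyDuplicated(Access_r[c("x", "y")])` before merging.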

IRTFM