
For example, suppose x is a data frame of two variables (Time and X) with len1 rows, and y is a data frame of two variables (Time and Y) with len2 rows. I want to merge x and y using the following code:

> x
                 Time    Value
1 2013-11-03 00:00:11 535.7680
2 2013-11-03 00:00:26 548.6214
3 2013-11-03 00:00:41 543.6477
4 2013-11-03 00:00:56 554.0778
5 2013-11-03 00:01:11 566.5635
6 2013-11-03 00:01:26 555.7684
> y
                 Time    Value
1 2013-11-03 00:00:11 455.4087
2 2013-11-03 00:00:26 457.7967
3 2013-11-03 00:00:41 455.3263
4 2013-11-03 00:00:56 461.9727
5 2013-11-03 00:01:11 460.6974
6 2013-11-03 00:01:26 466.2654

res<-merge(x,y,by="Time")
> res
                 Time  Value.x  Value.y
1 2013-11-03 00:00:11 535.7680 455.4087
2 2013-11-03 00:00:26 548.6214 457.7967
3 2013-11-03 00:00:41 543.6477 455.3263
4 2013-11-03 00:00:56 554.0778 461.9727
5 2013-11-03 00:01:11 566.5635 460.6974
6 2013-11-03 00:01:26 555.7684 466.2654

Only the head of x and y is shown above.

Why does res have more rows than both len1 and len2?

I want to merge x and y on matching "Time" values only; rows of x and y whose "Time" has no match in the other should be dropped.

Cheng
  • Provide [reproducible data](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Tried with dummy data, and `nrow(res)` is smaller than `nrow(x)` and `nrow(y)`. – zx8754 Apr 10 '15 at 07:44
  • @DominicComtois Default for merge is `all = FALSE`. – zx8754 Apr 10 '15 at 07:50
  • Compare, for both x and y, `length(x$Time)` and `length(unique(x$Time))` - then you'll see that maybe some Times are duplicated, explaining the larger nrow of your resulting dataframe. – Jason V Apr 10 '15 at 08:40
  • @JasonV How can I delete these duplicated times? – Cheng Apr 10 '15 at 08:45
  • I don't think I can answer that for you... if the whole row is duplicated, then I don't see why not, but if you have different values on the other variable(s), then you have to figure out if you want to keep everything, and if not, which one to erase! – Jason V Apr 10 '15 at 08:49
  • But if you choose to erase rows, just `x <- x[-row.indexes,]` – Jason V Apr 10 '15 at 08:55
  • Thanks, but how can I find the duplicated row??? @JasonV – Cheng Apr 10 '15 at 08:59
  • `duplicated(x$Time)` – plannapus Apr 10 '15 at 09:04
  • ...or `which(duplicated(x[,1]))` – Jason V Apr 10 '15 at 09:07
  • To compare other variables for those duplicate Times, you can use `ind.to.delete <- which(duplicated(x[,1]));comparisons <- sort(append(ind.to.delete, ind.to.delete-1));x[comparisons,]` – Jason V Apr 10 '15 at 09:18

1 Answer


From the help page of merge:

The rows in the two data frames that match on the specified columns are extracted, and joined together. If there is more than one match, all possible matches contribute one row each.

Without a reproducible example I can't say for sure, but it is likely that your Time column contains duplicated values. See for instance the following example:

A <- data.frame(a=c(1,2,3,1),b=1:4)
B <- data.frame(a=c(1,2,3,1),c=1:4)
merge(A,B,by="a")
  a b c
1 1 1 1
2 1 1 4
3 1 4 1
4 1 4 4
5 2 2 2
6 3 3 3
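
If duplicated Time values do turn out to be the cause, one possible fix is to drop the repeats before merging. The sketch below uses made-up data (the values and the `x_unique`/`y_unique` names are hypothetical) and assumes that keeping the first row for each Time is acceptable, which may not hold if the duplicated rows carry different values:

```r
# Hypothetical data: the Time "00:00:26" appears twice in each data frame
x <- data.frame(Time  = c("00:00:11", "00:00:26", "00:00:26", "00:00:41"),
                Value = c(535.8, 548.6, 548.9, 543.6))
y <- data.frame(Time  = c("00:00:11", "00:00:26", "00:00:26", "00:00:56"),
                Value = c(455.4, 457.8, 457.9, 462.0))

# A non-zero count means some Time values are repeated
sum(duplicated(x$Time))
sum(duplicated(y$Time))

# Merging as-is: the repeated Time contributes 2 * 2 = 4 rows,
# so the result has more rows than either input
res_dup <- merge(x, y, by = "Time")
nrow(res_dup)

# Assumption: keep only the first row for each Time, then merge
x_unique <- x[!duplicated(x$Time), ]
y_unique <- y[!duplicated(y$Time), ]
res <- merge(x_unique, y_unique, by = "Time")
res
```

With the default `all = FALSE`, Times that are present in only one of the two data frames are dropped from the result.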
plannapus