I have two data.frame(panel data) down below
A <- data.table("ID"=rep(c("OO1", "002", "003", "004", "005"), each=3),
"DATE"=rep(c(200802, 200803, 200804), time=5),
"Weight"=rep(c(60, 55, 80, 73, 62), each=3)); A
ID DATE Weight
1: OO1 200802 60
2: OO1 200803 60
3: OO1 200804 60
4: 002 200802 55
5: 002 200803 55
6: 002 200804 55
7: 003 200802 80
8: 003 200803 80
9: 003 200804 80
10: 004 200802 73
11: 004 200803 73
12: 004 200804 73
13: 005 200802 62
14: 005 200803 62
15: 005 200804 62
and
B <- data.table("ID"=rep(c("002", "003", "004", "007", "008"), each=3),
"DATE"=rep(c(200806, 200807, 200808), time=5),
"EXERCISE"=rep(c("sleep", "studying",
"movie", "jogging", "basketball"), each=3)); B
ID DATE EXERCISE
1: 002 200806 sleep
2: 002 200807 sleep
3: 002 200808 sleep
4: 003 200806 studying
5: 003 200807 studying
6: 003 200808 studying
7: 004 200806 movie
8: 004 200807 movie
9: 004 200808 movie
10: 007 200806 jogging
11: 007 200807 jogging
12: 007 200808 jogging
13: 008 200806 basketball
14: 008 200807 basketball
15: 008 200808 basketball
I try to merge these two data.frame, and I only want to reserve the information of data.frame A,that is ID=001, 002, 003, 004, 005. So, what I expected the data is
ID DATE Weight EXERCISE
1 OO1 200802 60 <NA>
2 OO1 200803 60 <NA>
3 OO1 200804 60 <NA>
4 002 200802 55 <NA>
5 002 200803 55 <NA>
6 002 200804 55 <NA>
7 002 200806 NA sleep
8 002 200807 NA sleep
9 002 200808 NA sleep
10 003 200802 80 <NA>
11 003 200803 80 <NA>
12 003 200804 80 <NA>
13 003 200806 NA studying
14 003 200807 NA studying
15 003 200808 NA studying
16 004 200802 73 <NA>
17 004 200803 73 <NA>
18 004 200804 73 <NA>
19 004 200806 NA movie
20 004 200807 NA movie
21 004 200808 NA movie
22 005 200802 62 <NA>
23 005 200803 62 <NA>
24 005 200804 62 <NA>
but my code's result is like
C <- merge(A, B, by="ID", all.x=T); C
ID DATE.x Weight DATE.y EXERCISE
1 002 200802 55 200806 sleep
2 002 200802 55 200807 sleep
3 002 200802 55 200808 sleep
4 002 200803 55 200806 sleep
5 002 200803 55 200807 sleep
6 002 200803 55 200808 sleep
7 002 200804 55 200806 sleep
8 002 200804 55 200807 sleep
9 002 200804 55 200808 sleep
10 003 200803 80 200806 studying
11 003 200803 80 200807 studying
12 003 200803 80 200808 studying
13 003 200802 80 200806 studying
14 003 200802 80 200807 studying
15 003 200802 80 200808 studying
16 003 200804 80 200806 studying
17 003 200804 80 200807 studying
18 003 200804 80 200808 studying
19 004 200804 73 200807 movie
20 004 200804 73 200808 movie
21 004 200804 73 200806 movie
22 004 200802 73 200807 movie
23 004 200802 73 200808 movie
24 004 200802 73 200806 movie
25 004 200803 73 200807 movie
26 004 200803 73 200808 movie
27 004 200803 73 200806 movie
28 005 200802 62 NA <NA>
29 005 200803 62 NA <NA>
30 005 200804 62 NA <NA>
31 OO1 200802 60 NA <NA>
32 OO1 200803 60 NA <NA>
33 OO1 200804 60 NA <NA>
and I try another way, no [by="ID"],
> C <- merge(A, B, all.x=T, all.y=T); C
ID DATE Weight EXERCISE
1 002 200802 55 <NA>
2 002 200803 55 <NA>
3 002 200804 55 <NA>
4 002 200806 NA sleep
5 002 200807 NA sleep
6 002 200808 NA sleep
7 003 200802 80 <NA>
8 003 200803 80 <NA>
9 003 200804 80 <NA>
10 003 200806 NA studying
11 003 200807 NA studying
12 003 200808 NA studying
13 004 200802 73 <NA>
14 004 200803 73 <NA>
15 004 200804 73 <NA>
16 004 200806 NA movie
17 004 200807 NA movie
18 004 200808 NA movie
19 005 200802 62 <NA>
20 005 200803 62 <NA>
21 005 200804 62 <NA>
22 007 200806 NA jogging
23 007 200807 NA jogging
24 007 200808 NA jogging
25 008 200806 NA basketball
26 008 200807 NA basketball
27 008 200808 NA basketball
28 OO1 200802 60 <NA>
29 OO1 200803 60 <NA>
30 OO1 200804 60 <NA>
it's close to what I expected, but didn't reserve only A's ID, so I'm curios about what is the difference between merge and merge, by in R, and how to fix it?