I've got two tables of different lengths and I need to merge them by two common columns (season and client.ID), filling cells with NA where there is no matching row. Below is a small fraction of the two original tables, followed by the final table that I need. I've tried many things with no success.

season  client.ID   qtty
1998    13  30
1999    13  30
2000    13  29
1998    28  18
1999    28  18
2000    28  18
1998    35  21
1999    35  21
2000    35  21

season  client.ID   vessel.ID   overLength
1998    28  29  17.1
1998    28  1809    4.26
1998    28  2215    9.45
1998    28  4173    5.8
1998    28  8151    4.5
1999    28  29  17.1
1999    28  1809    4.26
1999    28  2215    9.45
1999    28  4173    5.8
1999    28  8151    4.5
2000    28  29  17.1
2000    28  1809    4.26
2000    28  2215    9.45
2000    28  4173    5.8
2000    28  8151    4.5
1998    35  36  9.91
1999    35  36  9.91
2000    35  36  9.91
1998    35  40  9.91
1999    35  40  9.91
2000    35  40  9.91


season  client.ID   vessel.ID   overLength  qtty
1998    13  NA  NA  30
1999    13  NA  NA  30
2000    13  NA  NA  29
1998    28  29  17.1    18
1998    28  1809    4.26    18
1998    28  2215    9.45    18
1998    28  4173    5.8 18
1998    28  8151    4.5 18
1999    28  29  17.1    18
1999    28  1809    4.26    18
1999    28  2215    9.45    18
1999    28  4173    5.8 18
1999    28  8151    4.5 18
2000    28  29  17.1    18
2000    28  1809    4.26    18
2000    28  2215    9.45    18
2000    28  4173    5.8 18
2000    28  8151    4.5 18
1998    35  36  9.91    21
1999    35  36  9.91    21
2000    35  36  9.91    21
1998    35  40  9.91    21
1999    35  40  9.91    21
2000    35  40  9.91    21
mnel
Rafael

1 Answer

This is a simple case for `merge` with `all = TRUE`.

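Assuming your data are in `data1` and `data2`, this should work:

merge(data1, data2, all = TRUE)

By default, `merge` joins on all columns the two data frames have in common (here `season` and `client.ID`).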

If you want to specify the merge columns explicitly (in case there are common columns that you do not want to use as keys), pass `by`:

merge(data1, data2, all = TRUE, by = c('season', 'client.ID'))
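
To check this against the sample data in the question, here is a minimal, self-contained sketch. The data frames are rebuilt by hand from the tables shown in the question. Putting `data2` first is my own choice, made only so that the columns come out in the order of the desired table, and the final `order()` call is likewise optional.

```r
# Rebuild the two sample tables from the question.
data1 <- data.frame(
  season    = rep(1998:2000, times = 3),
  client.ID = rep(c(13, 28, 35), each = 3),
  qtty      = c(30, 30, 29, 18, 18, 18, 21, 21, 21)
)

data2 <- data.frame(
  season     = c(rep(1998:2000, each = 5), rep(1998:2000, times = 2)),
  client.ID  = c(rep(28, 15), rep(35, 6)),
  vessel.ID  = c(rep(c(29, 1809, 2215, 4173, 8151), times = 3),
                 rep(c(36, 40), each = 3)),
  overLength = c(rep(c(17.1, 4.26, 9.45, 5.8, 4.5), times = 3),
                 rep(9.91, 6))
)

# Full outer join on the two key columns; rows without a match are padded with NA.
merged <- merge(data2, data1, by = c("season", "client.ID"), all = TRUE)

# merge() sorts on the key columns by default; reorder by client, then season (optional).
merged <- merged[order(merged$client.ID, merged$season, merged$vessel.ID), ]

print(merged)
```

With these inputs the result has 24 rows, three of which (client 13) carry NA in `vessel.ID` and `overLength`.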
mnel
  • I'm getting this error: 'cannot allocate vector of size 128.0 Mb'. I guess it's because my laptop isn't able to do the task. Thanks a lot anyway. Cheers, Rafael. P.S. I'll try on another computer. – Rafael Oct 29 '12 at 04:53
  • Read this (http://stackoverflow.com/questions/1358003/tricks-to-manage-the-available-memory-in-an-r-session). Also, the `data.table` package avoids a lot of internal copying. There are great vignettes, and the code above will work if `data1` and `data2` are `data.table` objects. – mnel Oct 29 '12 at 04:56
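
As a rough illustration of the `data.table` suggestion in the last comment, the sketch below runs the same `merge` call on `data.table` objects. It assumes the `data.table` package is installed, and `dt1` and `dt2` are hypothetical stand-ins built from a few of the question's rows, not the real data.

```r
library(data.table)  # assumes the data.table package is installed

# Toy stand-ins for data1 and data2 (a hand-picked subset of the question's rows).
dt1 <- data.table(season = c(1998, 1998),
                  client.ID = c(13, 28),
                  qtty = c(30, 18))
dt2 <- data.table(season = c(1998, 1998),
                  client.ID = c(28, 28),
                  vessel.ID = c(29, 1809),
                  overLength = c(17.1, 4.26))

# Same call as for data frames; it dispatches to the data.table merge method.
merged_dt <- merge(dt2, dt1, by = c("season", "client.ID"), all = TRUE)

print(merged_dt)
```

For data frames that already exist, `setDT(data1)` converts them to `data.table` in place rather than making a copy, which may help when memory is tight.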