Possible Duplicate:
how to use merge() to update a table in R

What is the proper use of merge for this kind of operation in R? See below.

older <- data.frame(Member=c("first","second","third","fourth"),
                       VAL=c(NA,NA,NA,NA))
newer <- data.frame(Member=c("third","first"),
                       VAL=c(2125,4587))

# 
merge.data.frame(older,newer,all=T)
  Member  VAL
  1  first 4587
  2  first   NA
  3 fourth   NA
  4 second   NA
  5  third 2125
  6  third   NA

The output above is not what I expect. I want to replace the older entries with the newer ones rather than add extra rows, as shown below. I cannot get this result with merge.data.frame.

my.merge.fu(older,newer)
  Member  VAL
  1  first 4587
  2 second   NA
  3  third 2125
  4 fourth   NA

This is a kind of selective row replacement, where newer takes precedence and cannot contain any Members other than those in older.

Is there a proper English term for such an R operation, and is there a prebuilt function for it?

Thank you.

Petr Matousu
  • If you use `by='Member'` and `all=TRUE` you get a new column with the structure you're looking for. But I'm curious to know if you can use `merge` and return only that column. – Justin Dec 11 '12 at 16:51
  • There is a similar question with two possible solutions posted previously on [SO](http://stackoverflow.com/questions/3190118/how-to-use-merge-to-update-a-table-in-r/) – Didzis Elferts Dec 11 '12 at 17:00
  • @Petr Could `newer` contain a `fifth` entry? – Matthew Plourde Dec 11 '12 at 17:04
  • @Matthew Thanks for asking. No, newer cannot contain Members other than those in older. I edited the question to add this important detail. – Petr Matousu Dec 11 '12 at 17:48
  • @Justin It looks like your hint could work well for me; at least a small wrapper function will do the job. Thank you very much for the advice. – Petr Matousu Dec 11 '12 at 18:06
  • Thank you all for the hints. In the end I chose the method mentioned in the previously posted solution pointed out by Didzis. The method is `older$VAL[match(newer$Member, older$Member)] <- newer$VAL`, which gives me exactly what I need. – Petr Matousu Dec 11 '12 at 18:31
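
For comparison, Justin's merge-based hint could be sketched roughly as follows. This is only an illustration, not a recommended replacement for the one-liner above. It assumes that newer has at most one row per Member, that an NA in newer should never overwrite a value in older, and it uses `all.x=TRUE` instead of `all=TRUE` because newer cannot contain Members missing from older. The names `m` and `result` are made up for this example.

```r
older <- data.frame(Member = c("first", "second", "third", "fourth"),
                    VAL = c(NA, NA, NA, NA),
                    stringsAsFactors = FALSE)
newer <- data.frame(Member = c("third", "first"),
                    VAL = c(2125, 4587),
                    stringsAsFactors = FALSE)

# Join on Member only, so the two VAL columns come back as VAL.x and VAL.y
m <- merge(older, newer, by = "Member", all.x = TRUE, sort = FALSE)

# Take the newer value where there is one, otherwise keep the older value
m$VAL <- ifelse(is.na(m$VAL.y), m$VAL.x, m$VAL.y)

# merge() does not keep the row order of older, so restore it explicitly
result <- m[match(older$Member, m$Member), c("Member", "VAL")]
rownames(result) <- NULL
result
```

The join has to be on Member alone: merging on every shared column (the default) treats rows with different VAL as different records, which is why the original call added rows instead of replacing them.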

1 Answer

You have effectively answered your own question.

If you want to deal with Matthew Plourde's point (that newer might contain Members which are not in older), you could use

older$VAL[match(newer[newer$Member %in% older$Member, ]$Member, older$Member)
          ]  <- newer[newer$Member %in% older$Member, ]$VAL

This also has the effect that, where newer has several values for the same Member, it is the last one which ends up in older. For example,

older <- data.frame(Member=c("first","second","third","fourth"),
                       VAL=c(1234,NA,NA,5678))
newer <- data.frame(Member=c("third","first","fifth","first"),
                       VAL=c(2125,4587,2233,9876))

older$VAL[match(newer[newer$Member %in% older$Member,]$Member, older$Member)
          ]  <- newer[newer$Member %in% older$Member,]$VAL

gives

> older
  Member  VAL
1  first 9876
2 second   NA
3  third 2125
4 fourth 5678
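
If this is needed in more than one place, the same idea could be wrapped in a small helper. The sketch below is only one way to do it: the function name `update_vals` and its `key` and `value` arguments are invented for this example and are not part of base R.

```r
# Hypothetical helper: overwrite values in `older` with those from `newer`,
# matching rows on the `key` column. Rows of `newer` whose key is absent
# from `older` are ignored; if a key is repeated in `newer`, the last wins.
update_vals <- function(older, newer, key = "Member", value = "VAL") {
  keep <- newer[[key]] %in% older[[key]]          # drop unknown keys
  idx  <- match(newer[[key]][keep], older[[key]]) # target rows in older
  older[[value]][idx] <- newer[[value]][keep]
  older
}

older <- data.frame(Member = c("first", "second", "third", "fourth"),
                    VAL = c(1234, NA, NA, 5678))
newer <- data.frame(Member = c("third", "first", "fifth", "first"),
                    VAL = c(2125, 4587, 2233, 9876))

updated <- update_vals(older, newer)
updated
```

Because the function returns a modified copy, older itself is left unchanged unless the result is assigned back to it.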
Henry