I have a partitioned table (by ID) that contains the following:
Name,ID,PASS
A,1,abc
B,2,dfg
C,3,jkl
I want to insert new rows into this table and also update the existing rows whose ID is already present.
The rows I want to append:
Name,ID,PASS
D,4,asd
B,2,kkk
C,3,rrr
As you can see, rows B and C are updates to rows that already exist in the table. The final result I want is the following:
Name,ID,PASS
A,1,abc # existed and stays as it was
B,2,kkk # existed, PASS replaced with the new value
C,3,rrr # existed, PASS replaced with the new value
D,4,asd # did not exist and was appended
In other words, rows whose ID appears in both tables are updated with the new values, rows that appear only in the old table are kept unchanged, and rows that appear only in the new data are added.
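To make the intended semantics concrete, here is a minimal sketch of the merge outside Hive, in plain Python. It only illustrates the expected result (an upsert keyed on ID) and is not a Hive solution; the names `old_rows`, `new_rows`, and `upsert` are hypothetical and chosen for this illustration.

```python
# Rows are (name, id, pass) tuples; the ID is the key, as in the tables above.
old_rows = [("A", 1, "abc"), ("B", 2, "dfg"), ("C", 3, "jkl")]
new_rows = [("D", 4, "asd"), ("B", 2, "kkk"), ("C", 3, "rrr")]


def upsert(old, new):
    """Merge two row lists by ID: new rows win on a shared ID, all other rows are kept."""
    merged = {row[1]: row for row in old}        # index the old rows by ID
    merged.update({row[1]: row for row in new})  # overwrite matching IDs, add new IDs
    return [merged[key] for key in sorted(merged)]


result = upsert(old_rows, new_rows)
for name, row_id, password in result:
    print(f"{name},{row_id},{password}")
```

The equivalent behaviour is what I am looking for in Hive.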
I tried the INSERT OVERWRITE command in Hive, but it simply replaced the old table with the new rows, losing the information I wanted to keep. Any suggestions on how this can be done in Hive?