I have a Hive table, partitioned by ID, that contains the following:

Name,ID,PASS
A,1,abc
B,2,dfg
C,3,jkl

I want to insert new rows into the above table, and also update the existing rows where a matching ID already exists.

What I want to append:

Name,ID,PASS
D,4,asd
B,2,kkk
C,3,rrr

As you can see, rows B and C are updates to rows already in the old table. The final result I want is the following:

Name,ID,PASS
A, 1, abc  # existed and stays as it was
B, 2, kkk  # existed but replaced with the new PASS
C, 3, rrr  # existed but replaced with the new PASS
D, 4, asd  # did not exist and has been appended

In other words, the old rows whose ID also appears in the new data are updated, the new rows are appended, and the remaining old rows are left unchanged.

I tried to use the INSERT OVERWRITE command in Hive, but this just replaced the old table with the new data, losing the rows that I wanted to keep. Any suggestions on how this can be done in Hive?

Vamkos
Exactly the same: https://stackoverflow.com/a/37744071/2700344 or this solution: https://stackoverflow.com/a/44755825/2700344 – leftjoin Apr 18 '19 at 10:53

Possible duplicate of [Hive: Best way to do incremetal updates on a main table](https://stackoverflow.com/questions/37709411/hive-best-way-to-do-incremetal-updates-on-a-main-table) – leftjoin Apr 18 '19 at 10:56
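
For illustration, here is a minimal sketch of one common way to express this kind of merge in HiveQL: a FULL OUTER JOIN between the existing table and the new batch, written back with INSERT OVERWRITE. The table names `main_table` and `updates_table` are hypothetical, and the sketch assumes the columns are `name` and `pass` with `id` as the partition column, that `id` is unique within each table, and that dynamic partitioning is allowed on the cluster. It is not necessarily what the linked answers do.

```sql
-- Hypothetical names: main_table (partitioned by id) holds the old rows,
-- updates_table holds the new batch (D, B, C in the example above).
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE main_table PARTITION (id)
SELECT
  -- When the batch has a row for this id, take its values;
  -- otherwise keep the values already in the table.
  CASE WHEN u.id IS NOT NULL THEN u.name ELSE m.name END AS name,
  CASE WHEN u.id IS NOT NULL THEN u.pass ELSE m.pass END AS pass,
  -- The dynamic partition column must come last in the SELECT list.
  COALESCE(u.id, m.id) AS id
FROM main_table m
FULL OUTER JOIN updates_table u
  ON m.id = u.id;
```

With the sample data, A exists only in `main_table` and is kept as is, B and C match on `id` and take the new PASS, and D exists only in `updates_table` and is added. Because every old row appears in the join output, overwriting the table does not drop the rows that have no update.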
