Create a table with primary key in Hive. Insert the identical data record several times.
How can you avoid that the data record (primary key) is not inserted more than once without using a second temporary table?
drop table t1;
CREATE TABLE IF NOT EXISTS `t1` (
`ID` BIGINT DEFAULT SURROGATE_KEY(),
`Name` STRING NOT NULL DISABLE NOVALIDATE,
CONSTRAINT `PK_t1` PRIMARY KEY (`ID`) DISABLE NOVALIDATE);
select * from t1;
+--------+----------+
| t1.id | t1.name |
+--------+----------+
+--------+----------+
insert into t1 values (1, "Hi");
insert into t1 values (1, "Hi");
insert into t1 values (1, "Hi");
select * from t1;
+--------+----------+
| t1.id | t1.name |
+--------+----------+
| 1 | Hi |
| 1 | Hi |
| 1 | Hi |
+--------+----------+
I tried unsuccessfully with a merge:
MERGE INTO t1
USING (select * from t1) sub
ON sub.id != t1.id
WHEN not matched then insert values (2, "World");