I got a sqoop export job for exporting per user unique counts from hdfs to mysql.
Here is the table used:
CREATE TABLE IF NOT EXISTS user_uniques (
user_id int(10) unsigned NOT NULL,
uniques int(10) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (user_id)
) ENGINE=InnoDB
Heres a sample insert/update statement being used:
INSERT INTO user_uniques (user_id, uniques) VALUES (1,200), (2,300), (3,400) ON DUPLICATE KEY UPDATE uniques = VALUES(uniques)
We are inserting at 100 records per statement and 100 statements per transaction. But the throughput is very low.
Are there any other upsert techniques that we can rely on for better performance?
Thanks!