0

I got a sqoop export job for exporting per user unique counts from hdfs to mysql.

Here is the table used:

CREATE TABLE IF NOT EXISTS user_uniques (
  user_id int(10) unsigned NOT NULL,
  uniques int(10) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (user_id)
) ENGINE=InnoDB

Heres a sample insert/update statement being used:

INSERT INTO user_uniques (user_id, uniques) VALUES (1,200), (2,300), (3,400) ON DUPLICATE KEY UPDATE uniques = VALUES(uniques)

We are inserting at 100 records per statement and 100 statements per transaction. But the throughput is very low.

Are there any other upsert techniques that we can rely on for better performance?

Thanks!

user2727704
  • 625
  • 1
  • 10
  • 21
  • `unsigned NOT`? that's not a valid sql data type... `unsigned NOT NULL` maybe, but NOT by itself? flat out syntax error. – Marc B Jun 24 '14 at 19:01
  • Sorry, typo when composing the post. – user2727704 Jun 24 '14 at 19:03
  • Which engine are you using? If you are using InnoDb, there's a performance cost for unordered primary keys: http://stackoverflow.com/questions/23470130/random-primary-key-for-innodb – wils484 Jun 24 '14 at 19:06
  • Thanks wils484 for the pointers. We are using InnoDB but the problem is we don't have many insert, mostly will be updates. The insert aren't taking much time when we have the insert though. – user2727704 Jun 24 '14 at 19:19

0 Answers0