1

Basically, I have 3 tables: customer_profiles_lib, customer_profiles_tmp and customer_duplicates_tmp.

I want to check if each record from customer_profiles_lib is in customer_profiles_tmp... If it is not, INSERT into customer_profiles_tmp... If it is, INSERT INTO customer_duplicates_tmp.

I tried this in a procedure, but I have 9 million records to process and it's going way too slow... Here is what I have:

CREATE DEFINER=`company`@`%` PROCEDURE `customerImport`()
BEGIN
    DECLARE unique_id INT; 
    DECLARE fin INT;
    DECLARE curs CURSOR FOR SELECT customer_id AS unique_id FROM customer_profiles_lib;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET fin = 1;

    OPEN curs;
    SET fin = 0;
    REPEAT
    FETCH curs INTO unique_id;


    IF (SELECT EXISTS (SELECT customer_id FROM customer_profiles_tmp WHERE customer_id = unique_id)) THEN 
        SELECT unique_id AS 'ADDING'; 
        INSERT IGNORE INTO customer_duplicates_tmp (first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number)
        SELECT first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number FROM customer_profiles_lib WHERE customer_id = unique_id ORDER BY customer_profile_id DESC LIMIT 1;
    ELSE 
        SELECT unique_id AS 'SKIPPING'; 
        INSERT IGNORE INTO customer_profiles_tmp (first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number)
        SELECT first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number FROM customer_profiles_lib WHERE customer_id = unique_id ORDER BY customer_profile_id DESC LIMIT 1; 
    END IF; 

    UNTIL fin END REPEAT;
    CLOSE curs;
END

This way takes 1 hour, and works for inserts, but doesn't put anything in my customer_duplicates_tmp table.

INSERT IGNORE INTO customer_profiles_tmp (
first, 
last, 
address_1, 
address_2, 
city, 
state, 
zipcode, 
email, 
customer_id, 
phone, 
store_number
) 
SELECT 
tmp.first, 
tmp.last, 
tmp.address_1, 
tmp.address_2, 
tmp.city, 
tmp.state, 
tmp.zipcode, 
tmp.email, 
tmp.customer_id, 
tmp.phone, 
tmp.store_number 
FROM customer_profiles_lib AS tmp;

Thank you for your help!

brandoncluff
  • 303
  • 1
  • 5
  • 19
  • You could maybe use `on duplicate key` instead of `insert ignore` to insert the record into the other table, sort of like in this question: http://stackoverflow.com/questions/3884344/mysql-on-duplicate-key-insert-into-an-audit-or-log-table It's not very clean and pretty though. – pilsetnieks May 17 '16 at 22:43

1 Answers1

1

It seems the whole RBAR procedure can be replaced by two SQL statements with dramatic performance improvement:

INSERT IGNORE INTO customer_duplicates_tmp
  (first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number)
SELECT
   first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number
FROM customer_profiles_lib
WHERE customer_id IN (SELECT customer_id FROM customer_profiles_tmp);

INSERT IGNORE INTO customer_profiles_tmp
  (first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number)
SELECT
   first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number
FROM customer_profiles_lib
WHERE customer_id NOT IN (SELECT customer_id FROM customer_profiles_tmp);
Y.B.
  • 3,526
  • 14
  • 24