0

Trying to improve upon an import process which currently relies on a user manipulating data in an excel spreadsheet so that it can be imported into a permanent database table, my current line of thinking was to import the spreadsheet data into a temp table and then use that to update the permanent table - the columns contained in both tables would be identical in terms of data types.

The permanent table

ID col1 col2 col3
29384756 WWWWW "2021-01-01 00:00:00" current value
29384757 XXXXX "2021-01-02 00:00:00" current value
29384758 KLMNO "2022-02-24 00:00:00" current value
29384759 YYYYY "2021-01-03 00:00:00" current value
29384760 ZZZZZ "2021-01-04 00:00:00" current value

The temporary table

ID col1 col2 col3
1 ABCDE "2022-02-22 00:00:00" new value
2 FGHIJ "2022-02-23 00:00:00" new value
3 KLMNO "2022-02-24 00:00:00" new value
4 PQWRS "2022-02-25 00:00:00" new value
5 TUVWX "2022-02-26 00:00:00" new value

I want to iterate through each row of the temporary table and perform a duplicate check based on the combination of col1 and col2 from the temp table against the same columns in the perm table.

If the duplicate check returns a result from the perm table, then an UPDATE should be performed

If the duplicate check returns no result from the perm table, then an INSERT should be performed

The permanent table after update from temp table

ID col1 col2 col3
29384756 WWWWW "2022-02-22 00:00:00" current value
29384757 XXXXX "2022-02-23 00:00:00" current value
29384758 KLMNO "2022-02-24 00:00:00" new value
29384759 YYYYY "2022-02-25 00:00:00" current value
29384760 ZZZZZ "2022-02-26 00:00:00" current value
29384761 ABCDE "2022-02-22 00:00:00" new value
29384762 FGHIJ "2022-02-23 00:00:00" new value
29384763 PQWRS "2022-02-25 00:00:00" new value
29384764 TUVWX "2022-02-26 00:00:00" new value

The 3rd row from the temp table matched an existing record in the permanent table ("KLMNO2022-02-24 00:00:00" so the col3 value for that row was updated however all other records were inserted as new rows.

The excel solution builds an SQL query for each row in the spreadsheet, using a variable to determine whether or not the row is a duplicate.

For the UPDATE example above the query would look like:

@rowId := IFNULL( (SELECT ID FROM perm_table WHERE CONCAT(col1, col2) = 'KLMNO2022-02-24 00:00:00'),'');insert into perm_table values(@rowId, col1, col2, col3) ON DUPLICATE KEY UPDATE col3 = 'new value';

In that query we get the values 'KLMNO2022-02-24 00:00:00' and 'new value' from cells in a spreadsheet, I'm trying to do the same thing but instead of a spreadsheet, the temporary table is an SQL table.

Jason
  • 11
  • 4
  • Can you , or do you have a compound unique key on col1/col2? Are you aware of https://dev.mysql.com/doc/refman/8.0/en/insert-on-duplicate.html https://stackoverflow.com/questions/14383503/on-duplicate-key-update-same-as-insert – P.Salmon Mar 17 '22 at 07:39
  • Use `INSERT INTO ... ON DUPLICATE KEY UPDATE` – Barmar Mar 17 '22 at 08:00
  • Already doing that in the excel solution (upserting), trying to adapt it to an iterative approach in MySQL. Will add some extra info to the original post. – Jason Mar 17 '22 at 09:43
  • Solved, but it's slow, see [new question](https://stackoverflow.com/questions/71746342/mysql-5-7-performance-of-stored-procedure) for more info. – Jason Apr 05 '22 at 04:35

0 Answers0