Following on from a question I posted a couple of weeks back, the original question has been resolved but I've got some follow up questions relating to performance.
Firstly, here's an example of the stored procedure that addresses the original question, there are 23 values in the INSERT statement but I hope below paints a clear enough picture:
DELIMITER ;;
CREATE PROCEDURE myprocedure()
BEGIN
DECLARE n INT DEFAULT 0;
DECLARE i INT DEFAULT 1;
SELECT COUNT(*) FROM temp_table INTO n;
SET i=1;
WHILE i<=n DO
SET @dataId :=IFNULL((SELECT rowId FROM perm_table WHERE CONCAT(ID, somedate) = (SELECT CONCAT(ID, somedate) FROM temp_table WHERE temp_rowId = i)), 0);
INSERT INTO perm_table (val1,
val2,
val3,
...
)
SELECT @dataId,
val2,
val3,
...
FROM temp_table
WHERE temp_rowId = i
ON DUPLICATE KEY UPDATE val5 = (SELECT val5 FROM temp_table WHERE temp_rowId = i),
val6 = (SELECT val6 FROM temp_table WHERE temp_rowId = i),
val7 = (SELECT val7 FROM temp_table WHERE temp_rowId = i);
SET i = i + 1;
END WHILE;
TRUNCATE TABLE temp_table;
End;
;;
DELIMITER ;
This procedure produces the expected result in terms of row count, but the throughput is steady at 10 rows per second which results in a runtime of ~3.5 hours for an update across 120k rows which I'd really like to optimize further.
Here's a few things I've already tried to improve performance, unfortunately none of these attempts seem to have resulted in a meaningful drop in runtime:
- temp_table and perm_table are both indexed on all relevant fields, also added composite index on (ID, somedate) to each
- RDS (db.m5.large) storage using provisioned iops
- innodb_flush_log_at_trx_commit=0
Just wondering if there are any other suggestions as to what I might be able to look at to improve the runtime?