Identify what makes slow.
check JOIN is optimized
run SELECT only:
SELECT COUNT(*)
FROM xxx.product p LEFT JOIN xx.tof_art_lookup l
ON p.model_view = l.ARL_SEARCH_NUMBER;
how long takes? and EXPLAIN SELECT ...
check proper INDEX
is used for JOIN.
If everything is fine for JOIN, then UPDATEing row is slow. this situation is hard to make things faster.
UPDATE = DELETE and INSERT
I didn't tried this. but sometimes, this strategy is faster.. UPDATE
is DELETE old row and INSERT new row using new value.
// CREATE new table and INSERT
CREATE TABLE xxx.new_product
SELECT p.model_model, l. ARL_DISPLAY_NR, ...
FROM xxx.product p LEFT JOIN xx.tof_art_lookup l
ON p.model_view = l.ARL_SEARCH_NUMBER;
// drop xxx.procuct
// rename xxx.new_product to xxx.product
divide table into small chunk, and run concurrently
I think your job is CPU bounded and your UPDATE query uses just one CPU can't have benefit many cores. xxx.product TABLE has no constraint for join, there for 1M rows are updated sequencially
My suggestion following.
give some conditions to xxx.product so that xxx.product divided 20 group. (I don't no which column would be better for you, as I have no information about xxx.product)
then run 20 queries at once concurrently.
for example:
// for 1st chunk
UPDATE xxx.product AS p
...
WHERE p.model_view = l.ARL_SEARCH_NUMBER
AND p.column BETWEEN val1 AND val2; <= this condition spliting xxx.product
// for 2nd chunk
UPDATE xxx.product AS p
...
WHERE p.model_view = l.ARL_SEARCH_NUMBER
AND p.column BETWEEN val2 AND val3;
...
...
// for 20th chunk
UPDATE xxx.product AS p
...
WHERE p.model_view = l.ARL_SEARCH_NUMBER
AND p.column BETWEEN val19 AND val20;
It is important to find BETWEEN value distribute table evenly. Histogram may help you. Getting data for histogram plot