0

The structure of the new_table is that it will never insert new rows, but will only update the existing ones.

Currently I am using the following query. Is there any way to optimize this (in terms of performance) to achieve faster updates of the rows?

INSERT INTO new_table (hash, pages, visits, last_visit)
   SELECT A.hash, COUNT(B.id), A.visits, MAX(B.timestamp)
      FROM audience A
JOIN behaviour B ON B.hash = A.hash
GROUP BY A.hash
   ON DUPLICATE KEY UPDATE
      new_table.pages=VALUES(pages), 
      new_table.visits=VALUES(visits), 
      new_table.last_visit=VALUES(last_visit)
EnexoOnoma
  • 8,454
  • 18
  • 94
  • 179
  • 1
    can you show us the output from the SELECT part like : **EXPLAIN SELECT A.hash, COUNT(B.id), A.visits, MAX(B.timestamp) FROM audience A JOIN behaviour B ON B.hash = A.hash GROUP BY A.hash** – Bernd Buffen Jan 04 '17 at 23:13
  • 1
    Have you tried an `UPDATE...JOIN`? – shmosel Jan 04 '17 at 23:15
  • Do all Cat haters expect others to do all their work for free http://stackoverflow.com/questions/41468765/how-to-combine-these-queries-to-achieve-faster-performance – RiggsFolly Jan 04 '17 at 23:16
  • @BerndBuffen I have updated my question with the image. thank you – EnexoOnoma Jan 04 '17 at 23:22
  • @shmosel using Update i know how to assign values like `column = $variable` – EnexoOnoma Jan 04 '17 at 23:23
  • @Xalloumokkelos - is there really only 1 row in table audience and 6044 rows in table behaviour with the same hash value. can you also show the create of both tables – Bernd Buffen Jan 04 '17 at 23:31
  • @BerndBuffen well that was the result of it.... :/ – EnexoOnoma Jan 04 '17 at 23:38
  • Why don't you use `UPDATE` if you never need to insert new rows? – Barmar Jan 04 '17 at 23:40
  • @Barmar Can you show me how? I can do it using PHP, but first I need to select all the `hash` and then update each of them one by one. But I don't think that this is the right way, because I believe there must be a way to achieve this by plain mysql query. – EnexoOnoma Jan 04 '17 at 23:42

1 Answers1

1

Use UPDATE rather than INSERT.

UPDATE new_table n
JOIN audience AS a on a.hash = n.hash
JOIN (
    SELECT hash, COUNT(*) AS pages, MAX(timestamp) AS last_visit
    FROM behaviour
    GROUP BY hash) AS b ON b.hash = n.hash
SET n.pages = b.pages, n.visits = a.visits, n.last_visit = b.last_visits
Barmar
  • 741,623
  • 53
  • 500
  • 612
  • hi again and thank you for this. I tested it and it seems that it works. to be honest, compared with what I had, it was executed 25 seconds faster. 39 vs 14. So it has a huge impact on the performance! A question to understand the theory on this: in my example it tries first to insert and then update, and you removed the "insert" part, right? – EnexoOnoma Jan 04 '17 at 23:58
  • Exactly. This just updates, it doesn't waste time trying to insert and then failing over to the update code. – Barmar Jan 05 '17 at 00:01
  • thank you once again. you have been really helpful. upvote + accepted. – EnexoOnoma Jan 05 '17 at 00:03