4

I am in need of resetting flag of a table 'A' from 'X' to 'Y' where the update_date of a row satisfies the conditions 1. update_date > 1 month, 2. flag = 'X' & 3. type = 1.

And the update_date is checked against another table 'B'. I hope the following query will explain what exactly I need. Also this query works fine for me. But the problem is it is taking too long time. Actually my tables A & B are much bigger almost contains billion rows and there are about 10 columns.

When I run my sub query for selecting A.id I got the result immediately.

 SELECT a.id 
 FROM A a 
 JOIN B b 
 ON (a.id = b.id 
       AND a.name = b.name 
       AND a.type = 1 
       AND a.flag = 'X' 
       AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH) tmp_table)

But only the update query even if I put limit also it's taking much time.

UPDATE A 
SET flag='Y' 
WHERE id IN (SELECT a.id 
             FROM A a 
             JOIN B b 
             ON (a.id = b.id 
                   AND a.name = b.name 
                   AND a.type = 1 
                   AND a.flag = 'X' 
                   AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH) tmp_table))  
             LIMIT 100

I am looking for alternate solutions of my query which makes it fast. Hope I could write a stored procedure for it. But in SP I should loop through for each target_ids right?

I don't wish to write two separate queries in PHP, since there are many threads of my PHP scripts running on cron which returns same results (time latency).

Also to note, I do have enough indexing for columns.

Wish to update limits by limit. ie., update 1000+ records for every run.

Garfield
  • 2,487
  • 4
  • 31
  • 54

3 Answers3

3

Change in with exists

EXISTS will be faster because once the engine has found a hit, it will quit looking as the condition has proved true. With IN it will collect all the results from the subquery before further processing.

UPDATE A a
JOIN B b 
ON (a.id = b.id 
               AND a.name = b.name 
               AND a.type = 1 
               AND a.flag = 'X' 
               AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH))
SET a.flag='Y'
ORDER BY a.id LIMIT 1000;

EDITED Supporting substitute of LIMIT (IT will update only 1st 100 records)

SET @rn = 0;
UPDATE A a
JOIN (SELECT @rn:=@rn+1 AS rId, id, name FROM B b 
      JOIN A a 
      ON (@rn < 100 AND a.id = b.id 
               AND a.name = b.name 
               AND a.type = 1 
               AND a.flag = 'X' 
               AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH)
      ) 

) b 
ON (a.id=b.id) 
SET a.flag='Y'
WHERE b.rId < 100;

Using exist clause

Update A a 
SET a.flag='Y'
WHERE EXISTS (SELECT 1 FROM B b WHERE a.id = b.id 
               AND a.name = b.name 
               AND a.type = 1 
               AND a.flag = 'X' 
               AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH)) 
ORDER BY a.id LIMIT 1000;

Hope this helps

Meherzad
  • 8,433
  • 1
  • 30
  • 40
  • Is it like this `UPDATE A SET flag='Y' WHERE EXISTS (` instead of `UPDATE A SET flag='Y' WHERE id IN (` – Garfield Mar 06 '13 at 11:52
  • It worked fast :) but it updated the records which has update_date less than 30 days! – Garfield Mar 06 '13 at 11:57
  • I tried 1st query: Column 'flag' in field list is ambiguous. Since I have flag in both tables! How can I handle this? – Garfield Mar 06 '13 at 12:10
  • Thanks! one more, if I want to update limits by limits? I put LIMIT it said incorrect usage of LIMIT & UPDATE. I put LIMIT at last. – Garfield Mar 06 '13 at 12:14
  • Limit alone is not possible you need to add `order by LIMIT ` to specify the limit on what column. – Meherzad Mar 06 '13 at 12:22
  • Limit throws error for 1st qry, 2nd is working, but still taking more time just for updating 10 records. Still my query is running :( – Garfield Mar 06 '13 at 12:30
  • `Incorrect usage of UPDATE and ORDER BY` for 1st one. but really really still my query keep on running! – Garfield Mar 06 '13 at 12:58
  • 1
    In multiple table reference query it wont be possible to use LIMIT http://stackoverflow.com/questions/4291833/mysql-error-incorrect-usage-of-update-and-limit. It would be better if you use a condition in where clause which limits number of records. – Meherzad Mar 06 '13 at 13:02
  • Yes `Update A a SET a.flag='Y' WHERE EXISTS (SELECT 1 FR...` taking much time but no error. `UPDATE A a JOIN B b ON (a.id = b.id` throwing error as i said above. So I decided to go with 2nd one since no limit support. but still takes a lot of time.. – Garfield Mar 06 '13 at 13:03
  • Both the queries are running slow ? Check the edit for first query – Meherzad Mar 06 '13 at 13:08
  • Yeah. I have the 8,000,000+ rows which satisfies my condition! – Garfield Mar 06 '13 at 13:10
  • `JOIN (SELECT @rn:=@rn+1 AS rId, id name FROM B b ` should be `JOIN (SELECT @rn:=@rn+1 AS rId, id, name FROM B b `? comma missing, I tried to edit it here. – Garfield Mar 06 '13 at 13:33
  • Yes changed that typo also added the @rn < 100 condition – Meherzad Mar 06 '13 at 13:42
  • Thank you so much... it works great... I have to run the same for another couple of tables. but in this case, I do not have `id` col in both table. I have `id` in `A` but `A & B` has common field 'title'. I should select the `A.a` by checking `A.title = B.title` and with the same other conditions. I think I should post an another question for this. – Garfield Mar 06 '13 at 13:59
  • I got an another solution which worked perfectly! Meherzad you could check my answer. – Garfield Mar 07 '13 at 11:49
0

You can use a join too

UPDATE A
  LEFT JOIN (SELECT
           a.id
         FROM A AS a
           JOIN B AS b
         ON a.id = b.id
         WHERE a.name = b.name
         AND a.type = 1
         AND a.flag = 'X'
         AND a.update_date > DATE_SUB(NOW(), INTERVAL 1 MONTH)) AS l
    ON l.id = A.id
SET flag = 'Y'
WHERE id = l.id
Muhammad Raheel
  • 19,823
  • 7
  • 67
  • 103
0

Finally, I got the better performing optimized query. Simply A join to temp table.

UPDATE A AS a JOIN (
SELECT a.id FROM A AS a JOIN B AS b ON
    b.type = a.type 
    AND b.name = a.name 
    AND b.last_update_date < DATE_SUB(NOW(), INTERVAL 1 MONTH) 
    AND a.type = 1 
    AND a.flag = 'X' 
ORDER BY a.id DESC LIMIT 1000)
AS source ON source.id = a.id
SET flag = 'Y';

Thanks to http://www.xaprb.com/blog/2006/08/10/how-to-use-order-by-and-limit-on-multi-table-updates-in-mysql

Garfield
  • 2,487
  • 4
  • 31
  • 54