-1

I believe this is causing anywhere from a 5 minute to 20 minute delay depending on the number of records. I need to translate it into a LEFT JOIN but need some help getting it there.

qry_arr = array(':bill_type' => "INT");
$sql = "update ".$billing_table." c set c.bill_type = :bill_type";
$sql .= " WHERE  NOT EXISTS (SELECT s.abbreviation FROM   state s WHERE  s.abbreviation = c.out_location)";
$sql .= " and c.out_location != 'UNKNOWN' and c.out_location != ''";
user3783243
  • 5,368
  • 5
  • 22
  • 41
ComputerGiant
  • 159
  • 1
  • 1
  • 12
  • 1
    joins are almost always faster than sub-queries –  Dec 19 '18 at 21:43
  • 1
    @tim That's why he wants to convert it to a left join. – Barmar Dec 19 '18 at 21:43
  • Yes, use a join. – The Impaler Dec 19 '18 at 21:43
  • 1
    Maybe `$sql = "update $billing_table c left join state s on s.abbreviation = c.out_location set c.bill_type = :bill_type WHERE c.out_location != 'UNKNOWN' and c.out_location != ''"` Then you probably want `s.abbreviation` to be null for a non existent row – user3783243 Dec 19 '18 at 21:44
  • 1
    Does your table `billing_table` have an index on `out_location`? You should add it if not already present. – The Impaler Dec 19 '18 at 21:44
  • See https://meta.stackoverflow.com/questions/333952/why-should-i-provide-an-mcve-for-what-seems-to-me-to-be-a-very-simple-sql-query – Strawberry Dec 19 '18 at 21:45
  • The $billing_table is created via php as well. I need to review that code to see how to add an index to it with something like INDEX i (g) at the end of the code. I am onsite for the next 14 hours so when I return I'll post. – ComputerGiant Dec 20 '18 at 12:05

1 Answers1

1
UPDATE $billing_table c
LEFT JOIN state s ON s.abbreviation = c.out_location
SET c.bill_type = :bill_type
WHERE s.abbreviation IS NULL
AND c.out_location NOT IN ('UNKNOWN', '')

This is essentially the same as the syntax for a SELECT for the rows that don't match. See Return row only if value doesn't exist. Just replace SELECT ... FROM with UPDATE, and insert the SET clause before WHERE.

Make sure you have indexes on out_location and abbreviation.

Barmar
  • 741,623
  • 53
  • 500
  • 612
  • Are indexes required here? The code creates the $billing_table table first with no columns, then in a second call parses a CSV file to get the column names and adds them to the table. This would be a 3rd SQL call to create the index on them. Something like CREATE INDEX out_location_index ON $billing_table([out_location]); s.abbreviation from the state table is a permanent table and is indexed with names abbreviation, abbreviation2. Do I need to give the out_location index a new name when creating the index and reference that in your code above? – ComputerGiant Dec 20 '18 at 12:10
  • It might not be needed. If most rows of `$billing_table` need to be processed, then a full scan of the table doesn't add much more overhead than using the index. To be sure, you'll have to benchmark it. – Barmar Dec 20 '18 at 17:51
  • Will be testing this weekend and will update progress. Thanks! – ComputerGiant Dec 21 '18 at 13:33
  • This works but it turns out isn't where my slow code is. The code at the following link takes 31 minutes to run and needs the help. Thanks! https://stackoverflow.com/questions/53935927/speed-up-sql-update-statement-that-is-looped-through-a-select-statement – ComputerGiant Dec 26 '18 at 18:51
  • Do you have the necessary indexes? Have you tried using `EXPLAIN` to see the query plan for the analogous `SELECT` query? – Barmar Dec 26 '18 at 19:01
  • Definitely do not have indexes on these tables. EXPLAIN select id, did, customer_id from dids returns id = 1, select_type = SIMPLE, table = dids, type = ALL, possible_keys = NULL, key = NULL, key_len = NULL, ref = NULL, rows = 1955, Extra = '' – ComputerGiant Dec 26 '18 at 19:10
  • My answer says to make sure you have the necessary indexes. I'm not surprised it's slow without indexes. – Barmar Dec 26 '18 at 19:10
  • Yup. your code is actually quick. It's the next set of code that is slow. Can I add indexes using their original column names as the index name like this? ALTER TABLE dids ADD INDEX did (did) – ComputerGiant Dec 26 '18 at 19:13
  • Sure you can. Column names and index names are independent. – Barmar Dec 26 '18 at 19:15
  • 1
    In fact, if you don't give a name to the index, it automatically uses the column name by default. – Barmar Dec 26 '18 at 19:15
  • Adding those 4 indexes makes the processing time drop from 30+ minutes to 40 seconds. Thank you! – ComputerGiant Dec 26 '18 at 19:27