Here's an SQL statement (actually two statements) that works -- it's taking a series of matching rows and adding a delivery_number which increments for each row:

SELECT @i:=0;
UPDATE pipeline_deliveries AS d
SET d.delivery_number = @i:=@i+1
WHERE d.pipelineID = 11
ORDER BY d.setup_time;

But now the client no longer wants them ordered by setup_time. They need to be ordered by departure time, which is a column in another table. I can't figure out how to do this.

The MySQL docs, as well as this answer, suggest that in version 4.0 and up (we're running MySQL 5.0) I should be able to do this:

SELECT @i:=0;
UPDATE pipeline_deliveries AS d RIGHT JOIN pipeline_routesXdeliveryID AS rXd
    ON d.pipeline_deliveryID = rXd.pipeline_deliveryID
LEFT JOIN pipeline_routes AS r
    ON rXd.pipeline_routeID = r.pipeline_routeID
SET d.delivery_number = @i:=@i+1
WHERE d.pipelineID = 11
ORDER BY r.departure_time,d.pipeline_deliveryID;

but I get the error #1221 - Incorrect usage of UPDATE and ORDER BY.

So what's the correct usage?

Blazemonger

3 Answers

2

You can't combine ORDER BY with an UPDATE that joins two (or more) tables.

You can bypass the limitation, with something like this:

UPDATE 
    pipeline_deliveries AS upd
  JOIN
    ( SELECT t.pipeline_deliveryID, 
             @i := @i+1 AS row_number 
      FROM 
          ( SELECT @i:=0 ) AS dummy
        CROSS JOIN 
          ( SELECT d.pipeline_deliveryID
            FROM 
                pipeline_deliveries AS d 
              JOIN 
                pipeline_routesXdeliveryID AS rXd
                  ON d.pipeline_deliveryID = rXd.pipeline_deliveryID
              LEFT JOIN 
                pipeline_routes AS r
                  ON rXd.pipeline_routeID = r.pipeline_routeID
            WHERE 
                d.pipelineID = 11
            ORDER BY 
                r.departure_time, d.pipeline_deliveryID
          ) AS t
    ) AS tmp
      ON tmp.pipeline_deliveryID = upd.pipeline_deliveryID
SET 
    upd.delivery_number = tmp.row_number ;

The above relies on two MySQL features: user-defined variables and ordering inside a derived table. Because the latter is not standard SQL, it may well break in a future release of MySQL, once the optimizer is clever enough to figure out that ordering inside a derived table is useless unless there is a LIMIT clause. In fact, the query does exactly that in the latest versions of MariaDB (5.3 and 5.5): it runs as if the ORDER BY were not there, and the results are not the expected ones. See a related question at the MariaDB site: GROUP BY trick has been optimized away.

The same may well happen in any future release of mainstream MySQL that improves the optimizer code (maybe in 5.6; anyone care to test this?).
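Since the ordering is only at risk when the derived table has no LIMIT, one stopgap is to give the inner query a LIMIT that can never cut anything off. The following is a sketch only: the value 18446744073709551615 (the largest BIGINT UNSIGNED) is my stand-in for "no limit", and the query still depends on the order in which user variables are evaluated, which MySQL does not guarantee.

```sql
UPDATE
    pipeline_deliveries AS upd
  JOIN
    ( SELECT t.pipeline_deliveryID,
             @i := @i+1 AS row_number
      FROM
          ( SELECT @i:=0 ) AS dummy
        CROSS JOIN
          ( SELECT d.pipeline_deliveryID
            FROM
                pipeline_deliveries AS d
              JOIN
                pipeline_routesXdeliveryID AS rXd
                  ON d.pipeline_deliveryID = rXd.pipeline_deliveryID
              LEFT JOIN
                pipeline_routes AS r
                  ON rXd.pipeline_routeID = r.pipeline_routeID
            WHERE
                d.pipelineID = 11
            ORDER BY
                r.departure_time, d.pipeline_deliveryID
            -- Assumption: a LIMIT makes the ORDER BY meaningful to the
            -- optimizer, so it should not be discarded.
            LIMIT 18446744073709551615
          ) AS t
    ) AS tmp
      ON tmp.pipeline_deliveryID = upd.pipeline_deliveryID
SET
    upd.delivery_number = tmp.row_number ;
```

Even if this behaves on a given server version, it is still a non-standard trick rather than a guarantee.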

So it's better to write this in standard SQL. The best option would be window functions, which MySQL 5.x does not implement. But you could also use a self-join, which is reasonably efficient as long as the update affects only a small subset of rows.

UPDATE 
    pipeline_deliveries AS upd
  JOIN
    ( SELECT t1.pipeline_deliveryID
           , COUNT(*) AS row_number
      FROM
          ( SELECT d.pipeline_deliveryID
                 , r.departure_time
            FROM 
                pipeline_deliveries AS d 
              JOIN 
                pipeline_routesXdeliveryID AS rXd
                  ON d.pipeline_deliveryID = rXd.pipeline_deliveryID
              LEFT JOIN 
                pipeline_routes AS r
                  ON rXd.pipeline_routeID = r.pipeline_routeID
            WHERE 
                d.pipelineID = 11
          ) AS t1
        JOIN
          ( SELECT d.pipeline_deliveryID
                 , r.departure_time
            FROM 
                pipeline_deliveries AS d 
              JOIN 
                pipeline_routesXdeliveryID AS rXd
                  ON d.pipeline_deliveryID = rXd.pipeline_deliveryID
              LEFT JOIN 
                pipeline_routes AS r
                  ON rXd.pipeline_routeID = r.pipeline_routeID
            WHERE 
                d.pipelineID = 11
          ) AS t2
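          -- Count every t2 row that sorts before t1, plus t1 itself:
          -- an earlier departure_time, or the same departure_time
          -- and a lower-or-equal pipeline_deliveryID
          ON t2.departure_time < t1.departure_time
          OR t2.departure_time = t1.departure_time 
             AND t2.pipeline_deliveryID <= t1.pipeline_deliveryID
          -- Rows with a NULL departure_time are numbered last
          OR t1.departure_time IS NULL
             AND ( t2.departure_time IS NOT NULL
                OR t2.departure_time IS NULL
                   AND t2.pipeline_deliveryID <= t1.pipeline_deliveryID
                 )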
      GROUP BY
          t1.pipeline_deliveryID  
    ) AS tmp
      ON tmp.pipeline_deliveryID = upd.pipeline_deliveryID
SET 
    upd.delivery_number = tmp.row_number ;
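On servers that do have window functions (MySQL 8.0 and later, MariaDB 10.2 and later; not the 5.0 server from the question), the same numbering could be sketched with ROW_NUMBER(). The alias rn is my own choice, because ROW_NUMBER is a reserved word in MySQL 8.0; the table and column names are the ones from the question.

```sql
-- Assumes a server with window function support (MySQL 8.0+ / MariaDB 10.2+).
UPDATE
    pipeline_deliveries AS upd
  JOIN
    ( SELECT d.pipeline_deliveryID,
             -- Number the rows in the requested order
             ROW_NUMBER() OVER (ORDER BY r.departure_time,
                                         d.pipeline_deliveryID) AS rn
      FROM
          pipeline_deliveries AS d
        JOIN
          pipeline_routesXdeliveryID AS rXd
            ON d.pipeline_deliveryID = rXd.pipeline_deliveryID
        LEFT JOIN
          pipeline_routes AS r
            ON rXd.pipeline_routeID = r.pipeline_routeID
      WHERE
          d.pipelineID = 11
    ) AS tmp
      ON tmp.pipeline_deliveryID = upd.pipeline_deliveryID
SET
    upd.delivery_number = tmp.rn ;
```

Note that an ascending ORDER BY in MySQL sorts NULL departure times first. To number them last instead, the sort key could start with `r.departure_time IS NULL`.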
ypercubeᵀᴹ
  • Seems to work. Can you add comments to the query, for posterity? – Blazemonger Jan 14 '13 at 19:13
  • Been working on this and debugging it all day. For whatever reason, the `delivery_number`s aren't being assigned in order of `r.departure_time`, but in order of `r.pipeline_routeID` instead. I'm mystified as to why. – Blazemonger Jan 14 '13 at 22:08
  • Oh, let me rewrite this. Yes, it's possible that it's assigning values to `row_number` first and only then applying the `ORDER BY` clause. – ypercubeᵀᴹ Jan 15 '13 at 06:21
  • Edited. Can you estimate how many rows will be updated by this? Just a few (hundreds)? Or many more? – ypercubeᵀᴹ Jan 15 '13 at 12:08
  • Wow. What a monster. When will MySQL finally have window functions? –  Jan 15 '13 at 12:09
  • @a_horse_with_no_name Yes, it is (a monster). That's why I asked about number of affected rows (a small number could be done in a self-join) – ypercubeᵀᴹ Jan 15 '13 at 12:10
  • So far it appears to work beautifully. It's comforting to know that I never would have thought of this one myself. – Blazemonger Jan 15 '13 at 14:56
1

Based on this documentation

For the multiple-table syntax, UPDATE updates rows in each table named in table_references that satisfy the conditions. In this case, ORDER BY and LIMIT cannot be used.

I don't know MySQL in depth, but you could open a cursor and process the rows one by one, or pass the rows back to the client code (PHP, Java, etc.) that you maintain and handle the processing there.
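The cursor route could look roughly like the stored procedure below. This is a sketch under assumptions, not tested against the asker's schema: the procedure name renumber_deliveries, the parameter p_pipelineID and the local variables are hypothetical, and pipeline_deliveryID is assumed to be an INT. Only the table and column names come from the question.

```sql
DELIMITER //

CREATE PROCEDURE renumber_deliveries(IN p_pipelineID INT)
BEGIN
    DECLARE done INT DEFAULT 0;
    DECLARE v_id INT;              -- assumed type of pipeline_deliveryID
    DECLARE v_n  INT DEFAULT 0;    -- running delivery number

    -- Walk the deliveries in the order the client asked for
    DECLARE cur CURSOR FOR
        SELECT d.pipeline_deliveryID
        FROM pipeline_deliveries AS d
          JOIN pipeline_routesXdeliveryID AS rXd
            ON d.pipeline_deliveryID = rXd.pipeline_deliveryID
          LEFT JOIN pipeline_routes AS r
            ON rXd.pipeline_routeID = r.pipeline_routeID
        WHERE d.pipelineID = p_pipelineID
        ORDER BY r.departure_time, d.pipeline_deliveryID;

    -- FETCH past the last row raises NOT FOUND; use it to end the loop
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    OPEN cur;
    read_loop: LOOP
        FETCH cur INTO v_id;
        IF done THEN
            LEAVE read_loop;
        END IF;
        SET v_n = v_n + 1;
        UPDATE pipeline_deliveries
           SET delivery_number = v_n
         WHERE pipeline_deliveryID = v_id;
    END LOOP;
    CLOSE cur;
END //

DELIMITER ;
```

It would then be called as `CALL renumber_deliveries(11);`. One single-row UPDATE per delivery is slow for large sets, so this only makes sense when few rows are affected.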

After more digging:

To eliminate the badly optimized subquery, you need to rewrite the subquery as a join, but how can you do that and retain the LIMIT and ORDER BY? One way is to find the rows to be updated in a subquery in the FROM clause, so the LIMIT and ORDER BY can be nested inside the subquery. In this way work_to_do is joined against the ten highest-priority unclaimed rows of itself. Normally you can’t self-join the update target in a multi-table UPDATE, but since it’s within a subquery in the FROM clause, it works in this case.

update work_to_do as target
   inner join (
      select w.client, work_unit
      from work_to_do as w
         inner join eligible_client as e on e.client = w.client
      where processor = 0
      order by priority desc
      limit 10
   ) as source on source.client = target.client
      and source.work_unit = target.work_unit
   set processor = @process_id;

There is one downside: the rows are not locked in primary key order. This may help explain the occasional deadlock we get on this table.

Woot4Moo
0

The hard way:-


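    -- 1. Add a scratch column to hold the value to sort by
    ALTER TABLE eav_attribute_option 
        ADD temp_value TEXT NOT NULL 
        AFTER sort_order;
    -- 2. Copy the sort value in from the joined table
    UPDATE eav_attribute_option o
        JOIN eav_attribute_option_value ov ON o.option_id=ov.option_id 
        SET o.temp_value = ov.value 
        WHERE o.attribute_id=90;
    -- 3. Number the rows with a single-table UPDATE, where ORDER BY is allowed
    SET @x = 0;
    UPDATE eav_attribute_option 
        SET sort_order = (@x:=@x+1) 
        WHERE attribute_id=90 
        ORDER BY temp_value ASC;
    -- 4. Drop the scratch column again
    ALTER TABLE eav_attribute_option
        DROP temp_value;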

Dallas Clarke