1

Hopefully this is the right forum, my question seems to overlap the stack exchange community so this seemed best.

I have some custom reports for my WooCommerce orders on my wordpress site. I have one query that is just freezing locally, meaning in my localhost my CPU goes to 100% and it never finishes and I don't understand why. To the point here is the query:

SELECT SUM(postmeta.meta_value)
FROM pca_postmeta AS postmeta
LEFT JOIN pca_woocommerce_order_items AS orders ON orders.order_id = postmeta.post_id
WHERE postmeta.meta_key = '_order_total'
AND orders.order_item_id IN (
    SELECT item_meta.order_item_id 
    FROM pca_woocommerce_order_itemmeta AS item_meta 
    LEFT JOIN pca_woocommerce_order_items AS orders ON item_meta.order_item_id = orders.order_item_id 
    LEFT JOIN pca_posts AS posts ON posts.ID = orders.order_id 
    WHERE item_meta.meta_value = '23563' 
    AND posts.post_status IN ('wc-processing','wc-completed') 
    GROUP BY orders.order_id
)

As you can hopefully see the goal here is to get the summation of all orders from this specific campaign (23563). The nested query works exactly as expected on its own, returning just a list of IDs like so:

NOTE: little curious if 2.6289 secs is long when it only returned 65 total, although there are 148220 total

enter image description here

The problem is this query doesn't seem to like the nested part. Any suggestions? Completely different approach in mind?

P.S. I use that nested query at other times as well to represent all orders by campaign id in my php reporting class. But for my question PHP has nothing to do with it.

UPDATE/FOLLOW UP:

Is it possible to convert this into a join as described here: Using a SELECT statement within a WHERE clause ? I'm a little light on my SQL so not sure how I would do that but it seems promising

Community
  • 1
  • 1
Xtremefaith
  • 907
  • 1
  • 9
  • 29
  • The `IN ( SELECT ... )` construct optimizes very poorly until 5.6. Are you using that version? – Rick James May 15 '15 at 22:47
  • If woocommerce is generating that code, complain to them. Meanwhile, I will chalk it up to another case of 3rd party software being problematical. – Rick James May 15 '15 at 22:48
  • 5.7 has a "query rewrite" feature that allows you to rewrite such queries on the fly. It's a kludge to work around 3rd party deficiencies. – Rick James May 15 '15 at 22:49
  • None of this is generated by WordPress or WooCommerce, I simply need a report of orders with custom metadata. I'm not sure what version I'm using need to check, could be different on host. Currently I'm writing this directly in phpmyadmin – Xtremefaith May 16 '15 at 00:22

1 Answers1

2
GROUP BY  orders.order_id

does not make sense because you are selecting only order_item_id.

pca_woocommerce_order_itemmeta would benefit from

INDEX(meta_value, order_item_id)

An this might be an equivalent query, but avoiding the IN(SELECT...):

SELECT  SUM(pm.meta_value)
    FROM  
      ( SELECT  im.order_item_id
            FROM  pca_woocommerce_order_itemmeta AS im
            LEFT JOIN  pca_woocommerce_order_items AS o
                             ON im.order_item_id = o.order_item_id
            LEFT JOIN  pca_posts AS posts ON posts.ID = o.order_id
            WHERE  im.meta_value = '23563'
              AND  posts.post_status IN ('wc-processing','wc-completed')
            GROUP BY  o.order_id 
      ) AS w
    JOIN  pca_woocommerce_order_items AS o ON w.order_item_id = o.order_item_id
    JOIN  pca_postmeta AS pm ON o.order_id = pm.post_id
    WHERE  pm.meta_key = '_order_total'

Edit

Some principles behind what I did. Here I am guessing at what the optimizer will do with various possible formulations of the query.

  • I got rid of LEFT -- This may have changed the output. But I needed to avoid LEFT JOIN ( SELECT ... ) which would not be optimizable.
  • By having one subquery in the list of "tables" being JOINed, the optimizer will (almost certainly) start with the subquery and do "Nested Loop Joins" to the other tables. NLJ is the common way to perform a query.
  • A subselect like that has no index, so it needs to be first in the order, else it will be very inefficient.
  • Without subqueries, the optimizer generally likes to start with whichever table has something in the WHERE clause.
  • The requirement to start with the subquery "table" is stronger than the desire to pick the table based on WHERE pm.meta_key = '_order_total'.
  • Inside the subquery, the only "=" test (WHERE im.meta_value = '23563) provides the likely starting point for that set of JOINs. This is further enhanced by it not being 'right' of a LEFT JOIN. Hence, I suggested that index.
Rick James
  • 135,179
  • 13
  • 127
  • 222
  • I don't get how this works in the FROM like this but it seems to work fine. Trying to do this with a more complicated query now and failing because I think I lack understanding what is happening here. – Xtremefaith May 16 '15 at 08:38
  • 1
    Even this query is more complex than the typical. I edited my answer to include some of the logic I applied to get there. [My blog on building indexes](http://mysql.rjweb.org/doc.php/index_cookbook_mysql) is partially relevant here. – Rick James May 16 '15 at 15:11
  • Also I agree about the idea of indexing meta_value cause its used a lot between all my different reports but not sure how to do that. I tried the `index` link in phpmyadmin but got an error: ` #1170 - BLOB/TEXT column 'meta_value' used in key specification without a key length`, which I believe means I can't index this because it is a "longtext" type (utf8mb4_unicode_ci) – Xtremefaith May 16 '15 at 19:27
  • Oh. A key-value schema. Condolences. – Rick James May 16 '15 at 23:52