-2

I've a requirement to show the similar products in product detail page in such a way that the products listed should match the current product properties in a hierarchical way as given below

  1. Size, color, category, company should match with the current product
  2. Size, color, category should match with the current product
  3. Size, color should match with the current product
  4. Size should match with the current product

My sql query is as given below:

(SELECT pd.product_id, pd.name, p.price 
        FROM mg_product_description pd 
            JOIN `mg_product` p ON p.product_id = pd.product_id 
        WHERE 
            pd.size_id = '33' AND 
            pd.color_id = '2' AND 
            pd.category_id = '3' AND 
            pd.company_id = '1'  AND 
            pd.product_id != '53'  AND 
            p.status = '1' 
            ORDER BY RAND() LIMIT 10 
    )
        UNION
    (SELECT pd.product_id, pd.name, p.price 
        FROM mg_product_description pd 
            JOIN `mg_product` p ON p.product_id = pd.product_id 
        WHERE 
            pd.size_id = '33' AND 
            pd.color_id = '2' AND 
            pd.category_id = '3'  AND 
            pd.product_id != '53'  AND 
            p.status = '1' 
            ORDER BY RAND() LIMIT 10 
    )
        UNION
    (SELECT pd.product_id, pd.name, p.price 
        FROM mg_product_description pd 
            JOIN `mg_product` p ON p.product_id = pd.product_id 
        WHERE 
            pd.size_id = '33' AND 
            pd.color_id = '2' AND 
            pd.product_id != '53'  AND 
            p.status = '1' 
            ORDER BY RAND() LIMIT 10 
    )
        UNION
    (SELECT pd.product_id, pd.name, p.price 
        FROM mg_product_description pd 
            JOIN `mg_product` p ON p.product_id = pd.product_id 
        WHERE 
            pd.size_id = '33' AND 
            pd.product_id != '53' AND 
            p.status = '1'
            ORDER BY RAND() LIMIT 10 
    )

53 - current product id and status denotes available!

Is there any way to optimize the above query?

Note Output required: We need 10 similar products. If there exists 4 products matching condition 1, then they need to listed at first in random order. Similarly we need to list the products matching other conditions below it.

Thanks in Advance!

Sankar V
  • 4,110
  • 5
  • 28
  • 52
  • 1
    ORDER BY RAND() is generally slow, anyway you need to provide explain output. – Aurimas Ličkus Jul 09 '13 at 08:17
  • sorry for being unclear but i was talking about, http://dev.mysql.com/doc/refman/5.0/en/explain.html – Aurimas Ličkus Jul 09 '13 at 08:23
  • By "optimize" do you mean "improve the performance" or "reduce the complexity" of the above query? –  Jul 09 '13 at 08:27
  • Both. Priority for improving the performance !! – Sankar V Jul 09 '13 at 08:30
  • This query will never return anything if product 53 is the current (status='1') product - all the subqueries are joining `ON p.product_id = pd.product_id`, but all of them include the condition `pd.product_id != '53'` in the `where` clause. –  Jul 09 '13 at 08:45
  • Thanks for the comment. The query is working fine! I just need to optimize it. – Sankar V Jul 09 '13 at 08:56

1 Answers1

1

Keeping using ORDER BY RAND() (which is not efficient) then a very basic improvement would be to add a priority for each clause, move the order clause to the end so it is only needed once.

(SELECT pd.product_id, pd.name, p.price , 1 AS recpriority
        FROM mg_product_description pd 
            JOIN `mg_product` p ON p.product_id = pd.product_id 
        WHERE 
            pd.size_id = '33' AND 
            pd.color_id = '2' AND 
            pd.category_id = '3' AND 
            pd.company_id = '1'  AND 
            pd.product_id != '53'  AND 
            p.status = '1' 
    )
        UNION
    (SELECT pd.product_id, pd.name, p.price , 2 AS recpriority
        FROM mg_product_description pd 
            JOIN `mg_product` p ON p.product_id = pd.product_id 
        WHERE 
            pd.size_id = '33' AND 
            pd.color_id = '2' AND 
            pd.category_id = '3'  AND 
            pd.product_id != '53'  AND 
            p.status = '1' 
    )
        UNION
    (SELECT pd.product_id, pd.name, p.price , 3 AS recpriority
        FROM mg_product_description pd 
            JOIN `mg_product` p ON p.product_id = pd.product_id 
        WHERE 
            pd.size_id = '33' AND 
            pd.color_id = '2' AND 
            pd.product_id != '53'  AND 
            p.status = '1' 
    )
        UNION
    (SELECT pd.product_id, pd.name, p.price , 4 AS recpriority
        FROM mg_product_description pd 
            JOIN `mg_product` p ON p.product_id = pd.product_id 
        WHERE 
            pd.size_id = '33' AND 
            pd.product_id != '53' AND 
            p.status = '1'
    )
    ORDER BY recpriority, RAND() LIMIT 10 

That could then be done without the need for unions by doing something like this:-

SELECT pd.product_id, 
    pd.name, 
    p.price , 
    CASE
        WHEN pd.color_id = '2' AND pd.category_id = '3' AND pd.company_id = '1'  THEN 1 
        WHEN pd.color_id = '2' AND pd.category_id = '3' THEN 2 
        WHEN pd.color_id = '2' THEN 3 
        ELSE 4 
    END AS recpriority
FROM mg_product_description pd 
JOIN `mg_product` p ON p.product_id = pd.product_id 
WHERE pd.size_id = '33' 
AND pd.product_id != '53' 
AND p.status = '1'
ORDER BY recpriority, RAND() 
LIMIT 10 
Kickstart
  • 21,403
  • 2
  • 21
  • 33
  • Awesome! Thanks a lot :) ! Is there any other way to select random rows (...using ORDER BY RAND() (which is not efficient)...) ? – Sankar V Jul 09 '13 at 10:36
  • Yes. I have used more efficient methods for returning a single row (essentially you get a random value between the min and max ids, then select the first id above that random value), but not sure of one that can be used when you are retrieving 10. This post might be useful for you for determining a good random method - http://stackoverflow.com/questions/4329396/mysql-select-10-random-rows-from-600k-rows-fast – Kickstart Jul 09 '13 at 11:01