1

I would like to SELECT certain data out of my mysql DB. I am working with a php loop and a sql statement with a LIMIT and UNION.

Problem: The speed of my query is terrible. One UNION statement tooks 2-4 seconds. Due to the loop the Overall-Query takes 3 Minutes. Is there a chance to optimize my query?

I tried to separate the "three" statements and merge the results. But this is not really faster. So I think that the UNION is not my problem.

PHP/SQL:

My code is running through two-foreach-loops. The code is working properly. But the performance is the problem.

$sql_country = "SELECT country FROM country_list"; 
foreach ($db->query($sql_country) as $row_country) { //first loop (150 entries)
    $sql_color = "SELECT color FROM color_list";

    foreach ($db->query($sql_color) as $row_color) { //second loop (10 entries)
        $sql_all = "(SELECT ID, price FROM company 
                            WHERE country = '".$row_country['country']."' 
                                AND color = '".$row_color['color']."'  
                                AND price BETWEEN 2.5 AND 4.5  
                                order by price DESC LIMIT 2)

                            UNION

                            (SELECT ID, price FROM company 
                            WHERE country = '".$row_country['country']."' 
                                AND color = '".$row_color['color']."'  
                                AND price BETWEEN 5.5 AND 8.2  
                                order by price DESC LIMIT 2)

                            UNION 

                            (SELECT ID, price FROM company 
                            WHERE country = '".$row_country['country']."' 
                                AND color = '".$row_color['color']."'  
                                AND price BETWEEN 8.5 AND 10.8  
                                order by price DESC LIMIT 2)";
        foreach ($db->query($sql_all) as $row_all) {
                $shopID[] = $row_all['ID']; //I just need these IDs
        }
    }
}

Do you have any idea or hints to get this faster?

Zeshan
  • 2,496
  • 3
  • 21
  • 26
Ben
  • 73
  • 7
  • 3
    Create a composite index on `(country, color, price, ID)` – Paul Spiegel Aug 08 '19 at 18:54
  • Also, between is usually insanely slow even with keyed fields. – mlewis54 Aug 08 '19 at 19:09
  • 1
    It looks like you are trying to something like [Get top n records for each group of grouped results](https://stackoverflow.com/q/12113699). (If you are on MySQL 8, you can also use windows functions, you would have to add that information). This will probably be faster than running 150*10 queries. – Solarflare Aug 08 '19 at 19:41
  • I would combine like 100 queries into huge UNION ALL statements in 15 chunks. UNION ALL is still the fastest solution for big groups. ROW_NUMBER() requires a full table/index scan. – Paul Spiegel Aug 08 '19 at 19:57
  • @PaulSpiegel : Your first comment helped me a lot. I could reduce the time from 25 minutes down to 20 seconds. Thanks a lot. – Ben Aug 10 '19 at 14:00
  • @mlewis54 Is there any other Chance to solve this instead of using "between"? – Ben Aug 10 '19 at 14:01
  • @Solarflare Thanks. This link helped as well. There are a lot of advices and hints. – Ben Aug 10 '19 at 14:02
  • I might miss something, but I don't see any issue with BETWEEN in this query. `WHERE a=1, b=2, c BETWEEN 1000 AND 2000 ORDER BY c LIMIT 2` should be efficient with an index on `(a, b, c)`. – Paul Spiegel Aug 10 '19 at 14:07
  • @Ben not really. You can do a >= and <= but I'm pretty sure internally that MySQL will convert to that internally. It's just a really expensive (computationally) operation. If the field is keyed it will be faster. – mlewis54 Aug 12 '19 at 16:48

1 Answers1

1

An index on (country, color, price, ID) should improve the performance of single queries from seconds to a couple of milliseconds or even less. But you still have the problem of executing 1500 queries. Depending on your system, a single query execution can add an overhead of about 10 ms, which would add up to 15 seconds in your case. You need to find a way to minimize the number of queries - In best case to a single query.

For low limits (like 2 in your case), you can combine multiple LIMIT 1 subqueries with different offsets. I would generate such a query dynamically.

$priceRanges = [
    ['2.5', '4.5'],
    ['5.5', '8.2'],
    ['8.5', '10.8'],
];

$limit = 2;
$offsets = range(0, $limit - 1);

$queryParts = [];
foreach ($priceRanges as $range) {
    $rangeFrom = $range[0];
    $rangeTo   = $range[1];
    foreach ($offsets as $offset) {
        $queryParts[] = "
            select (
              select ID
              from company cmp
              where cmp.country = cnt.country
                and cmp.color   = clr.color
                and cmp.price between {$rangeFrom} AND {$rangeTo}
              order by cmp.price desc
              limit 1
              offset {$offset}
            ) as ID
            from country_list cnt
            cross join color_list clr
            having ID is not null
        ";
    }
}

$query = implode(' UNION ALL ', $queryParts);

This will generate a quite long UNION query. You can see a PHP demo on rexester.com and SQL demo on db-fiddle.com.

I can't guarantee it will be any faster. But it's worth a try.

Paul Spiegel
  • 30,925
  • 5
  • 44
  • 53