-2

I tested so much things to get my Query the fastest as i can.

The table has over 2.9 Million Rows. And thats the Query

SELECT id, name, price, pricebefore, link, imagelink, updated, site, siteid 
    FROM items 
    WHERE (case when pricebefore3 is NULL then pricebefore else pricebefore3*1.5 end) >= pricebefore 
    AND price < pricebefore 
    AND isbn != -1 
    AND 1 = CASE
      WHEN (100-price/pricebefore*100) > 90 THEN updated < NOW() - INTERVAL ".$timeago." MINUTE
      ELSE (100-price/pricebefore*100) > ".mysqli_real_escape_string($link, $percentage)." 
    END
    ".preg_quote($merchant)."
    AND name LIKE '%".mysqli_real_escape_string($link, $search)."%' 
    ORDER BY updated DESC 
    LIMIT ".mysqli_real_escape_string($link, $perPage)." 
    OFFSET ".mysqli_real_escape_string($link, $site);

Explain the variables:

  • $timeago can be 0 or like 30 to get only items older than 30 minute

  • $percentage can be like 70 to show only items with more than 70 percentage on price difference

  • $merchant is that AND link REGEXP '".$merchant."' and the $merchant can look like that amazon.de/gp/product|idealo.de to get only items with that in the link (so only items from amazon.de and idealo.de

  • $search can be a keyword that includes in the name

  • $perPage and $site is for pagination. so 20 items per page, site 1

I tried so much different things but doesnt find a good way. There is also a way without $search line when $search is not set (so no one is searching specific item)

$merchant is also ony set when user has something in $merchant otherwise its that $merchant = '';

does anyone have an idea to make that faster? the sql is running with mysqli_connect just simple php file

and one more thing...

if i have this in regex, it is faster than the second one. But why?!

AND link REGEXP 'amazon.de/gp/product|idealo.de'
AND link REGEXP 'idealo.de'

The second is 2 secs slower than the first?!

Barmar
  • 741,623
  • 53
  • 500
  • 612
tevved
  • 25
  • 2
  • Please [edit] your question to add the EXPLAIN execution plan. – Ken White Oct 20 '21 at 20:12
  • 2
    All those calculations in the `WHERE` block make it nearly impossible to use indexes. – Barmar Oct 20 '21 at 20:13
  • and ise prepared statements, every it is posible see https://stackoverflow.com/questions/60174/how-can-i-prevent-sql-injection-in-php – nbk Oct 20 '21 at 20:13
  • @Barmar i know... but what can i do to use that WHERE... i mean i need them to get correct things... but whats really strange, its only that crazy slow, when i have `AND link REGEXP 'idealo.de'` if i have `AND link REGEXP 'amazon.de/gp/product|idealo.de' ` its faster... but i dont know why... thats one regexp vs two and its slower?! – tevved Oct 20 '21 at 20:18
  • looks like that is this "idealo.de" thing.. but any ideas why? with multiple regex works, only amazon.de also works good, but only idealo.de is slow – tevved Oct 20 '21 at 20:19
  • Why are you using `preg_quote()`? That prevents `|` from doing alternation, it's matched literally. – Barmar Oct 20 '21 at 20:24
  • The longer regexp may be faster because it can return false quickly whenever `link` is too long to match. – Barmar Oct 20 '21 at 20:26
  • @Barmar Oh ok, good to know. Was my idea to prevent some injections. But did you have an idea why regexp is slower with just `idealo.de` in it? with `'amazon.de/gp/product|idealo.de'` it is faster than only idealo.de wtf Did you have an idea how to print multiple links, like all amazon.de and all idealo.de? – tevved Oct 20 '21 at 20:27
  • I don't think there's any way to speed this up. It can't use indexes, so it has to scan the entire table. – Barmar Oct 20 '21 at 20:28
  • If you need to filter links based on the domain, you should add a column that just contains the domain, so you can match it directly instead of using a pattern. – Barmar Oct 20 '21 at 20:29
  • Thats so strange... if i search just %idealo.de" as a like than its slower than %amazon.de% as a like... – tevved Oct 20 '21 at 20:29
  • **You need to show us the table and index definitions**, as well as row counts for each of the tables. Maybe your tables are defined poorly. Maybe the indexes aren't created correctly. Maybe you don't have an index on that column you thought you did. Without seeing the table and index definitions, we can't tell. We need row counts because that can affect query planning. If you know how to do an `EXPLAIN` or get an execution plan, put the results in the question as well. If you have no indexes, visit http://use-the-index-luke.com. – Andy Lester Oct 20 '21 at 20:41
  • please edit your question to show (as text, not images) output of `select version()`, `show create table items`, and `EXPLAIN SELECT rest-of-your-query`. – ysth Oct 20 '21 at 20:57
  • is there always a merchant? if so, are they arbitrary or is there some fixed set of possible merchants? – ysth Oct 20 '21 at 20:59
  • Why do you bring up `AND link REGEXP ...`; it is not even in the query? – Rick James Oct 21 '21 at 04:42

1 Answers1

0

If idealo.de is the domain, then have a column for domain and test it with =, not LINK '%idealo.de%', nor REGEXP 'idealo.de'. And have

INDEX(domain, updated)
Rick James
  • 135,179
  • 13
  • 127
  • 222