0

I have large table with 615 million records in items and 10 thousand in markets

the below query takes 5+ seconds to execute

how can I make it faster. ?

SELECT items.name,items.ID as itemID,items.market_id as market_id,markets.name as marketname 
from items,markets 
where markets.ID = items.market_id and 
items.name like '%bab%' 
group by items.name 
order by items.name ASC 
limit 0,10

Explain

enter image description here

-> Limit: 10 row(s)  (actual time=4657.094..4657.096 rows=9 loops=1)

-> Sort: items.`name`, limit input to 10 row(s) per chunk  (actual time=4657.093..4657.094 rows=9 loops=1)

-> Table scan on <temporary>  (actual time=0.002..0.003 rows=9 loops=1)

-> Temporary table with deduplication  (cost=1141592.47 rows=625722) 
(actual time=4657.070..4657.072 rows=9 loops=1)

-> Nested loop inner join  (cost=1141592.47 rows=625722) (actual time=0.056..4541.471 rows=85329 loops=1)

-> Filter: (items.`name` like \'%bab%\')  (cost=593647.14 rows=625722) (actual time=0.043..4456.394 rows=85329 loops=1)

-> Table scan on items  (cost=593647.14 rows=5632061) (actual time=0.037..2864.754 rows=6153656 loops=1)

-> Single-row index lookup on markets using PRIMARY (ID=items.market_id)  (cost=0.78 rows=1) (actual time=0.001..0.001 rows=1 loops=85329)
Luuk
  • 12,245
  • 5
  • 22
  • 33
skcrpk
  • 558
  • 5
  • 18
  • Please add info from `EXPLAIN ANALYZE ` – Luuk Apr 01 '21 at 17:02
  • 2
    Your query is likely invalid assuming that `items.name` is not the primary key. Also, you have a `LIKE` condition in your where clause which is not sargable. This will make it hard to optimize your query. And finally, I don't see the point of using `GROUP BY` here, since you select no aggregates. – Tim Biegeleisen Apr 01 '21 at 17:04
  • From the picture (and from the comment Tim B. gave) it was already clear that `Table scan on items` is the cause of your problem. – Luuk Apr 01 '21 at 17:17
  • @Luuk so how do I re write the query to make it faster ? – skcrpk Apr 01 '21 at 17:19
  • See https://stackoverflow.com/questions/5905125/mysql-improve-search-performance-with-wildcards the reason for a full table scan is you are doing a "contains" that mysql cannot utilize the index when you have aleading wildcard. How long does `like 'bab%'` take (this is a starts with) ? – Rippo Apr 01 '21 at 17:24
  • @Rippo its takes 2+seconds which is better but not ideal – skcrpk Apr 01 '21 at 17:44
  • @Rippo will FTS be better solution ? change name column to text and do a match ? – skcrpk Apr 01 '21 at 17:47
  • @skcrpk did you read the answer to the link I sent, any FTS using a leading wildcard is goinf to take a long time, even removing it still takes time on 165M rows. – Rippo Apr 01 '21 at 19:14
  • @Rippo yes I did read that and tried to implement I was able to change the column to text and add a full text index it took long but result still took 2 seconds same as using ‘bab%’ as it utilises indexs in this case – skcrpk Apr 01 '21 at 19:32
  • The query is poorly formed if there are multiple rows for a given `name`. The query would not even complete on 8.0. To phrase it another way, _which_ `market_id` and items.ID` do you want to display when you group by items.name? – Rick James Apr 02 '21 at 03:10
  • @RickJames I will add tables schema in few hours so you can suggest a better approach ? – skcrpk Apr 02 '21 at 03:14
  • `MIN(market_id)`? AVG(market_id) (I doubt it)? GROUP_CONCAT(market_id)? – Rick James Apr 02 '21 at 03:16

2 Answers2

0

You might get a (small !?) improvement, when you do this:

SELECT items.name,items.ID as itemID,items.market_id as market_id,markets.name as marketname 
from items
inner join markets on markets.ID = items.market_id
where markets.ID IN (SELECT items.market_id 
                     FROM marketd
                     WHERE items.name like '%bab%' 
                     order by items.name ASC 
                     limit 0,10)
order by items.name

EDIT: oops, that's tru you cannot do LIMIT in a subquery....

If you have MySQL 8.0+, you can do:

WITH cte AS (
   SELECT items.market_id 
   FROM marketd
   WHERE items.name like '%bab%' 
   order by items.name ASC 
   limit 0,10
)
SELECT items.name,items.ID as itemID,items.market_id as market_id,markets.name as marketname 
from items
inner join markets on markets.ID = items.market_id
where markets.ID IN ( SELECT market_id 
                      FROM cte )
order by items.name;

also, you might need to add an index on market_id, if that does not exists yet. With that, the cte will use that index.

Luuk
  • 12,245
  • 5
  • 22
  • 33
0

This should work with any version of MySQL at least since 5.0:

SELECT items.name,
       items.ID as itemID,
       items.market_id as market_id,
       ( SELECT markets.name 
              FROM markets WHERE markets.ID = items.market_id
       ) as marketname 
    FROM items
    WHERE items.name like '%bab%' 
    ORDER BY items.name ASC 
    LIMIT 10

That should eliminate the GROUP BY.

Indexes:

items:  (name, ID, market_id)
markets:  presumably `ID` is the PRIMARY KEY
Rick James
  • 135,179
  • 13
  • 127
  • 222
  • it returns same items.names but when I add group by to your query it returns different items.name , which is what I want , I have created suggested index but that is not being used , still query takes more than 4+ seconds with group by and without it takes 3+ seconds – skcrpk Apr 02 '21 at 01:18
  • @skcrpk - by "same items.names", you mean duplicates? – Rick James Apr 02 '21 at 02:57
  • Not duplicate each market can have same name items – skcrpk Apr 02 '21 at 02:58
  • Then which market_id should be displayed after you group by the name? – Rick James Apr 02 '21 at 03:11