-1

I need to search for keywords, which is case insensitive. To do that, I'm using the below queries. Logic wise okay, but performance drastically going down.

Table info

item_tbl: 558991075
keywords: 2000
SELECT itemname from items i
LEFT JOIN keywords k
ON i.id = k.item_id
AND lower(i.itemname) LIKE CONCAT('%', lower(k.value), '%')
WHERE l.item_id is null

Example:

    Keywords:
    Iphone x
    Yellow Mango
    Red onion
    
    Item names:
    Some Item 1
    Red color Iphone X
    Big Yellow Mango

The final valid item is Some Item 1.

Is there a way to improve this query performance?

philipxy
  • 14,867
  • 6
  • 39
  • 83
KKK
  • 1,652
  • 7
  • 29
  • 49
  • I am not sure this is syntactically correct, could you be missing an 'AND' in this statement? – Matt Andruff Aug 10 '22 at 12:53
  • 1
    I think you are missing some table information it would definitely help to have more information to better help you. – Matt Andruff Aug 10 '22 at 12:55
  • [How to Optimize Queries in a Database - The Basics](https://stackoverflow.com/q/3191623/3404097) [What are your most common sql optimizations?](https://stackoverflow.com/q/1332778/3404097) [When and why are database joins expensive?](https://stackoverflow.com/q/173726/3404097) [Tips for asking a good SQL question](https://meta.stackoverflow.com/q/271055/3404097) [Asking query performance questions](https://dba.meta.stackexchange.com/q/3034/43932) [mre] – philipxy Aug 11 '22 at 01:36
  • What do you mean, "the final item" & what is the significance of that one being it? – philipxy Aug 11 '22 at 01:42

1 Answers1

1

What is likely taking the most time is the like. If you are going to regularly use this query and it's not a one time thing you should try and precompute it. Or you should try to make this a straight forward join.

Is there a delimiter you can use to reduce the amount of items that you compare with like. You basically have to compare every record to every other record when you use like.

  • split i.itemname into 'words' something like explode(split(i.itemname, ' ')) as words
  • split the k.value to get at least potential match largely reducing the words you evaluate
  • join on 'words' matching lower(split(k.value, ' ')[0]) = lower(words)
  • finally the actual match you wish to make where lower(i.itemname) LIKE CONCAT('%', lower(k.value), '%')

This would enable the power of a join to send the data to the right reducer and reduce the amount of comparisons.

Matt Andruff
  • 4,974
  • 1
  • 5
  • 21
  • I tried and it seems it's not working as expected. The reason is, the keywords table needs an exact match (case insensitive) with item_name. I've added an example in the question – KKK Aug 11 '22 at 01:00
  • 1
    I have further updated the answer to encompass the new information – Matt Andruff Aug 11 '22 at 13:10