6

I was trying to find the best way to search multiple columns for the occurence of a word. Ofcourse an option would be to add multiple OR's in the query:

SELECT * FROM table WHERE column1 LIKE %term1% OR column2 LIKE %term1% OR......

Yet I though this made the query quite long, especially when there are multiple terms

In my search I came across the CONCATfunction in MYSQL. This seems to allow me to make one search after concatenating the columns (right?):

SELECT * FROM table WHERE CONCAT(column1,column2,....) LIKE %term1% OR CONCAT(column 1,column2,...) LIKE %term2% ....

This has the advance of being easier to read and maintain

Because I do not yet understand the full use of CONCAT I do not know or this is a correct way to use it. I was wondering what is the correct and most efficient way to achieve this search. (note: I am also planning on having it sort as stated in this stackoverflow question: mysql SORT BY amount of unique word matches, maby there are problems while using a method with this?)

EDIT:

Maby it would be easier to just explain my final wish: In php i have an array of single words. I wish select all rows from my database where 1 of the words occurs in one of multiple colmuns and then sort them by relevance (or even add extra importance to some columns). (note that there are only a few thousands (less than 5000) rows to search)

Community
  • 1
  • 1
PSZ_Code
  • 1,015
  • 10
  • 29
  • 2
    Interesting. In either case finding matching rows will be slow (as it won't use indexes due to the leading wild cards in the LIKE). This might mean that doing a single like against a long string might manage to be quicker. Down side is is that if you search for the term ME if would also find rows where column1 ends with M and column2 starts with E – Kickstart Jul 22 '14 at 13:44
  • 1
    @Kickstart : What if the wildcards are skipped? The ME problem can be fixed by using concat_ws, no? – PSZ_Code Jul 22 '14 at 13:47
  • I'd stick to the first option but I have nothing to backup my point except that you may may get false result. Ex: column 1 : Rapido, column2 : Great you'll end up with RapidoGreat and you'll have a match for "dog". – ForguesR Jul 22 '14 at 13:49
  • 1
    Concat_ws fixes it until the user happens to search for a string containing the separator you have chosen to use. Problem with leading wildcards is that MySQL cannot use an index to find them (it possibly can with a trailing wild card). If there were no leading wildcards then using lots of ORs would allow indexes to be used, but concatenating the fields together would ensure the leading wildcard was still required. – Kickstart Jul 22 '14 at 13:50
  • most separators are filtered out in php before the query happens, so that would not be a problem. And how about some sort of fulltext search as Tata suggests? – PSZ_Code Jul 22 '14 at 13:52
  • My experience with full text searches is limited, and not really that successful. Either way a full text search is not going to be an option of individual columns concatenated together. Whatever you do, multiple LIKE statements with leading wildcards are going to perform badly. – Kickstart Jul 22 '14 at 14:03
  • I guess the leading wildcards can be dismissed though, they aren't that important. The ending wildcards i rather i have. So i guess without the leading wildcards I should go for the like? EDIT: seems i cannot skip the leading wildcards, it skips to many results – PSZ_Code Jul 22 '14 at 14:13
  • Separate LIKE statements allow you to drop the leading wild cards if not required, and thus allows the query to use indexes which should be far quicker. – Kickstart Jul 22 '14 at 14:18
  • 1
    "Sort by relevance" in the end of your post immediately causes "full text" as a solution. That's for what it's intended. But if you can not use it because your version of InnoDB doesn't support it yes - then, fine, best suggestion: upgrade your MySQL server to 5.6. Otherwise you'll be always around such kind of arcane solutions. From viewpoint of performance `OR LIKE` _may_ use index, but that's not guaranteed, in common case it won't. Thus, bith ways are bad and the only thing matter may be _readability_. And that's your decision to make. – Alma Do Jul 22 '14 at 14:24
  • It is not in my power to update the MySQL server – PSZ_Code Jul 22 '14 at 14:38

2 Answers2

3

Since i tought this was an interesting point, i tought the best would be good to give it a try.

Well, after your edit, it seems you don't have that much data, but i'll still post the result :


EDIT : This was my first try over MySQL testing, and, as @Alma Do said, those results "may be just fluctuation".


First query (on 90000 rows) road and road2 are VARCHAR(100):

SELECT * FROM adress WHERE CONCAT(road1,road2) LIKE '%test%'; 0.0503 sec

SELECT * FROM adress WHERE road1 LIKE '%test%' OR road2 LIKE '%test%'; 0.0710 sec

It seems indeed multiple LIKE is much more resourcefull. I'm not that good with MySQL resource verification, so i didn't go further on this, guess it should still give an idea.

The main problem i see is if CONCAT could create wrong matches :

SELECT CONCAT('fooa','bfoo') with LIKE '%ab%' would work, while 'fooa' LIKE '%ab% OR 'bfoo' LIKE '%ab% OR wouldn't.

For 500 rows, i don't think you would have resource issues, so i would have just gone with what 'has the advance of being easier to read and maintain'.

Hope it helps

Meeuuuhhhh
  • 388
  • 5
  • 10
  • 1
    This measurements are just numbers which are near senseless. Real situation may depend of many cases. For instance, first column may always contain desired word, thus, second `LIKE` will never be executed (short circuit). Or, otherwise, if first column does not contain that word, then both `LIKE` will be executed for each row. The only proper point is - word overlapping edge-case, but that may be fixed by using delimiter. However, 90.000 also is just too small amount to look seriously in testing results. 0.01..0.05 sec may be just fluctuation. – Alma Do Jul 22 '14 at 14:43
  • Thx for the tips, this was actually my first time trying to do some MySQL testing, wanted to try to see if i could do something, but indeed, lots of details started to be a problem... I believe they may not be that precise indeed, but well, hope it still can helps :) – Meeuuuhhhh Jul 22 '14 at 14:48
  • Good enough for me – Martin Zvarík Dec 20 '18 at 21:51
  • Thumbs up for the "has the advance of being easier to read and maintain" ! – Herbert Van-Vliet Feb 18 '21 at 10:45
  • By the way: See if you can use CONCAT_WS instead of CONCAT. – Herbert Van-Vliet Feb 18 '21 at 11:59
0

The best way is to use full text search engine for that task, or use fulltext capabilities of MySQL.

you can find more explanation about it here: fulltext-natural-language

Tata
  • 802
  • 9
  • 19
  • but using a INNOdb table and not being sure mysql 5.6 is being used, this can't be used – PSZ_Code Jul 22 '14 at 13:48
  • in that case - use external fulltext search engine. – Tata Jul 22 '14 at 13:52
  • it seems i've not yet heard of that. Could you explain? – PSZ_Code Jul 22 '14 at 13:53
  • you can use external fulltext engine, for example Solr, Sphinx or Elasticsearch. Then you should define the queries that will pull the data from your DB every X minutes by using a key. and perform all your searches there, and not in mysql. – Tata Jul 27 '14 at 12:57