I need to search a text (around 500 words long) for words in an English dictionary (around 275,000 keywords) to detect non-English words and right now the query I am using is not really optimized which takes more than 10 seconds to execute (there's a words
table and a texts
table):
SELECT word FROM words WHERE 'The quick brown fox jumps over the lazy dog' LIKE CONCAT( '%', word, '%' );
Got the idea from here.
I have already set the word
field as an Index and seen some examples of people storing the text in the database or putting it directly in the query.
Other examples showed people using FULLTEXT search although having 300k words I don't think a FULLTEXT will work, I guess it's good to search with logic +brown +lazy -apple
but in my case I don't need much logic.
Another example I've seen is to concate words with the IN (...)
clause although having 500m keywords the query would just be insanely long.
Any ideas what to do?
Right now the text is saved as a text
field and the words as varchar(50)
in InnoDB with utf8_unicode_ci
encoding, I've heard InnoDB is slow so I could use MyISAM or any other. I am using MySQL 5.5 although I could update to 5.6 if that helped.