I have a large dataset with 5M rows. One of the fields in the dataset is 'article_title', which I'd like to search in real-time for an autocomplete feature that I'm building on my site.
I've been experimenting with MySQL and MongoDB as potential DB solutions. Both perform well when an index is used, for example for 'something%', but I need to match titles within a string, as in '%something%'.
Both MySQL and MongoDB took 0.01 seconds with an index using forward-looking search, and about 6 seconds with a full string search.
I realize that the entire DB needs to be scanned for a string-in-string type search so what is the common approach to this problem? Solr and Sphinx seem like overkill for this one problem so I'm trying to avoid using them if possible.
If I got a box with 2 GB of RAM and a 40GB SSD (which is what I can afford at the moment), would I be able to get sub-second response time? Thanks in advance.
--
UPDATE: I tried a fulltext index and while the results are very fast, it doesn't really satisfy a string-in-string search ("presiden" doesn't match "president"). I'm looking for ways to match a string-in-string with a 5M row dataset.