My use case is the following:
I have a file which contains the columns placeName and country. Given an input place name, I want to find the country. Since the place name that is passed in might not be an exact match, I want to use %LIKE%-style matching.
Sample data in file:
PlaceName,Country
Los Angeles,US
Las Vegas,US
Portland,US
Input file:
Place1
place2
The input file has around 50 million place names. Iterating over that input one by one, and for each entry looping over the mapping in file 1 above and doing string matching, would be very inefficient. It is O(n*m), where n is the number of input places and m is the number of places in the mapping file.
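To make the naive approach concrete, here is a minimal sketch of the nested loop I am trying to avoid. The class and method names (NaivePlaceLookup, add, findCountry) are made up for illustration, and I am assuming that "like" means a case-insensitive check that the mapped place name contains the input, with the first match winning.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: a linear scan over the mapping for every input place.
final class NaivePlaceLookup {
    // Lower-cased place name -> country, in the order rows were added.
    private final Map<String, String> mapping = new LinkedHashMap<>();

    void add(String placeName, String country) {
        mapping.put(placeName.toLowerCase(Locale.ROOT), country);
    }

    // Rough equivalent of: WHERE placeName LIKE '%input%' (assumed semantics).
    // Costs O(m) per call, so n inputs cost O(n*m) in total.
    Optional<String> findCountry(String input) {
        String needle = input.trim().toLowerCase(Locale.ROOT);
        if (needle.isEmpty()) {
            return Optional.empty();
        }
        for (Map.Entry<String, String> entry : mapping.entrySet()) {
            if (entry.getKey().contains(needle)) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        NaivePlaceLookup lookup = new NaivePlaceLookup();
        lookup.add("Los Angeles", "US");
        lookup.add("Las Vegas", "US");
        lookup.add("Portland", "US");
        System.out.println(lookup.findCountry("vegas"));
        System.out.println(lookup.findCountry("Tokyo"));
    }
}
```

With 50 million inputs, findCountry would be called 50 million times, each time scanning the whole mapping.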
One obvious way is to load this data into a database and query it from there. But the list of input places is very big (~50 million), so querying the database again and again would be very inefficient.
The second approach I was thinking of is to implement some data structure which loads all the data from the file and allows 'like' operations in memory. But I don't seem to be able to find one.
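As a rough sketch of the kind of structure I have in mind, one possibility (my own assumption, not something I found in a library) is a trigram index: every 3-character slice of each place name points to the rows containing it, so a lookup only has to verify a few candidates instead of scanning everything. The names TrigramPlaceIndex, add and findCountry are hypothetical, and matching is again assumed to be a case-insensitive "contains".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical in-memory index supporting LIKE '%input%' style lookups.
final class TrigramPlaceIndex {
    private final List<String> names = new ArrayList<>();     // lower-cased place names
    private final List<String> countries = new ArrayList<>(); // country for the same row id
    private final Map<String, Set<Integer>> postings = new HashMap<>(); // trigram -> row ids

    void add(String placeName, String country) {
        String name = placeName.toLowerCase(Locale.ROOT);
        int id = names.size();
        names.add(name);
        countries.add(country);
        for (int i = 0; i + 3 <= name.length(); i++) {
            postings.computeIfAbsent(name.substring(i, i + 3), k -> new HashSet<>()).add(id);
        }
    }

    // Returns the country of some row whose name contains the input.
    // If several rows match, which one is returned is unspecified.
    Optional<String> findCountry(String input) {
        String needle = input.trim().toLowerCase(Locale.ROOT);
        if (needle.isEmpty()) {
            return Optional.empty();
        }
        if (needle.length() < 3) {
            // Too short to have a trigram: fall back to a linear scan.
            for (int id = 0; id < names.size(); id++) {
                if (names.get(id).contains(needle)) {
                    return Optional.of(countries.get(id));
                }
            }
            return Optional.empty();
        }
        // A matching row must contain every trigram of the input.
        Set<Integer> candidates = null;
        for (int i = 0; i + 3 <= needle.length(); i++) {
            Set<Integer> ids = postings.get(needle.substring(i, i + 3));
            if (ids == null) {
                return Optional.empty();
            }
            if (candidates == null) {
                candidates = new HashSet<>(ids);
            } else {
                candidates.retainAll(ids);
            }
            if (candidates.isEmpty()) {
                return Optional.empty();
            }
        }
        // Sharing all trigrams is necessary but not sufficient, so verify.
        for (int id : candidates) {
            if (names.get(id).contains(needle)) {
                return Optional.of(countries.get(id));
            }
        }
        return Optional.empty();
    }
}
```

The index is built once from the mapping file, and each of the 50 million inputs then does a handful of hash lookups plus a final contains check on the surviving candidates. This only covers substring matching; it would not help with misspelled inputs.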
The last option I can think of is to use a database as in approach 1, but with caching. Is there a way to tell a framework like Hibernate to load all the data from a table into its cache and do all the 'like' operations there?
Any suggestions on possible solutions?