I have in a table called institutions
containing the names of institutions. Each time a new name come in, a function needs to find any matches with respect to the existing institutions. For instance, we have the following institutions table:
NAME
FORMULA VIEW UNIVERSITY RESEARCH
FOURMULA VULCAN COLLEGE INSTITUTE
MOUNT VU FOOD GROWERS
FORMULA VU CAFE SHOP
The following synonyms table:
WORD SYN_LIST
EDUCATION SCHOOL, UNIVERSITY, COLLEGE, TRAINING
RESTAURANT BAR, BISTRO, BREAKFAST, BUFFET, CABARET, CAFE, FOOD, GALLEY, GASTROPUB, GRILL
STORE CONVENIENCE, FOOD, GROCER, GROCERY, MARKET, MART, SHOP, STORE, VARIETY
REFRIGERATION APPLIANCE, COLDSTORAGE, FOODLOCKER, FREEZE, FRIDGE, ICE, REFRIGERATION
Finally, the word substitution table:
WORD SUBS_LIST
MOUNTAIN MOUNTAIN, MOUNT, MT, MTN
VIEW VIEW, VU
FORMULA FORMULA, 4MULA, FOURMULA
In top of that, the name from institutions is classified as name|field
. To accomplish the classification of a word as name|field
is just needed to pass manually to the function using two input parameters. For instance, the name FORMULA VIEW UNIVERSITY RESEARCH
is decomposed as FORMULA VIEW
as name and UNIVERSITY RESEARCH
as field. When searching for a name in institutions, the name component (in this case FORMULA VIEW
has to find an exact match) and at least one of the field words (in this case UNIVERSITY
or/and RESEARCH
).
The previous match rule would look like as follows:
select
from institutions
where name like '%FORMULA VIEW%' || ' %UNIVERSITY% '
UNION
select
from institutions
where name like '%FORMULA VIEW%' || ' %RESEARCH% '
Then, if a new name comes in such as FORMULA VIEW COLLEGE
would be enough to find match with FORMULA VIEW UNIVERSITY RESEARCH
.
Finally, the synonym and word substitutions apply to the received name and each word can be in more than one category in the synonym table (as the word food is included in store and restaurant), then a varray could be needed in the function to find matches. Considering the previous three tables consider a new name comes such as MOUNTAIN VIEW STORE OFFICE
, then the function finds MOUNT VU FOOD GROWERS
already in institutions table as a match. This match occurs because food has two synonyms, then two words are generated: MOUNTAIN VIEW FOOD GROWERS
and MOUNTAIN VIEW STORE GROWERS
.
select
from institutions
where name like '%MOUNTAIN VIEW%' || ' %STORE% '
UNION
select
from institutions
where name like '%MOUNTAIN VIEW%' || ' %OFFICE% '
Even though just STORE
is a match and not OFFICE
, this is enough to identify as a match and indicate a conflict in STORE
.
I have been working on this problem for a while and created this DBFiddle to include most of the part of the previous explanation and attempting to solve the problem, but I have not been able to figure out how add the logic of searching for exact match in name
category and pass one field
category until passing the remaining field
categories like in the example:
select
from institutions
where name like '%MOUNTAIN VIEW%' || ' %STORE% '
UNION
select
from institutions
where name like '%MOUNTAIN VIEW%' || ' %OFFICE% '
Please let me know if you find another solution more suitable to approach this problem. I hope I was clear enough, please let me know any questions.