2

I have a table that has about 3000 different key-phrases (small strings). I need a query that takes the input of a large-string (like a large sentence or paragraph) and returns the records of key-phrases that exist inside the large-string.

Simplified example: Table has 5 records: "fishing", "pond", "cloud", "big one", "house".

Large-string input: "I have a pond at my house. I decided to go fishing and caught a big one."

Query outputs (matched records): pond, house, fishing, big one

Thank you, James

I tried a few select statements to no avail. Maybe I need to parse the large-string into some kind of array before making a statement/query?

dougp
  • 2,810
  • 1
  • 8
  • 31
  • 2
    which databse system are you using? – nbk Dec 14 '22 at 22:19
  • For PostgreSQL you may use this question: https://stackoverflow.com/questions/1986491/sql-split-string-by-space-into-table-in-postgresql – Renat Dec 14 '22 at 22:21
  • For SQL Server I suggest using full text search https://learn.microsoft.com/en-us/sql/relational-databases/search/full-text-search?view=sql-server-ver16 – Nick.Mc Dec 14 '22 at 22:42
  • Sorry I didn’t specify platform. I’m using mySQL with PHP. – James Johnson Dec 15 '22 at 01:53

2 Answers2

2
with small (term) as (
        select 'fishing'
  union select 'pond'
  union select 'cloud'
  union select 'big one'
  union select 'house'
)

select term
from small
where 'I have a pond at my house. I decided to go fishing and caught a big one.' 
  REGEXP concat('(^|.*[^a-z])', term, '([^a-z].*|$)') = 1
dougp
  • 2,810
  • 1
  • 8
  • 31
  • a 'cloudy', 'ponderous', 'housemate' would be a match – JJ32 Dec 15 '22 at 03:08
  • Good catch. Thanks. I updated my answer for that and for the new requirement that this is for MySQL – dougp Dec 15 '22 at 16:23
  • Thank you for this. Unfortunately, it turns out the db is MariaDB, so I'm not able to use REGEXP_LIKE. – James Johnson Dec 15 '22 at 22:52
  • Please use correct tags including software version where possible. If you can't tag the question with the right version number, state it in the question. – dougp Dec 16 '22 at 17:02
  • I updated my answer after spending about 20 seconds reading MariaDB docs. – dougp Dec 16 '22 at 17:03
0

You can use the FULLTEXT index. No matter how large your string input is.

Then use the following:

SELECT * FROM `table_name` 
WHERE MATCH(records) AGAINST ('"your large string"' IN BOOLEAN MODE) AS relevance;

For more see these answers:

ahmed
  • 9,071
  • 3
  • 9
  • 22
Grvito
  • 13
  • 3