1

SCENARIO:

I have one large table (let's call it "WordTable") with a list of words (let's call the field "theWord") that could have 10,000+ records.

I also have a large table (let's call it "MySentences") with a VARCHAR field (let's call the field "theSentence") that contains many varied sentences - it could have millions of records.

QUESTION: What SQL could I write for the MySQL database to give me a list of which records in MySentences.theSentence contain any of the words from WordTable.theWord ?

Since there are many records in both tables, using numerous Like statements is not feasible. Would FullText Search allow some capability here?

Hopefully this helps... by the way, a "sentence" does not always need to have spaces... it could just be a collection of letters

Here are some MySQL scripts to illustrate the scenario:

CREATE TABLE `MySentences` (
  `id` int(11) NOT NULL,
  `theSentence` varchar(1000) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=latin1;

INSERT INTO `MySentences` (`id`, `theSentence`) VALUES
(1, 'hereisatestsentence'),
(2, 'asdfasdfadsf'),
(3, 'today is a blue sky'),
(4, 'jk2k2lkjskylkjdsf'),
(5, 'ddddddd'),
(6, 'nothing'),
(7, 'sometest');

CREATE TABLE `WordTable` (
  `id` int(11) NOT NULL,
  `theWord` varchar(50) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1;

INSERT INTO `WordTable` (`id`, `theWord`) VALUES
(1, 'test'),
(2, 'house'),
(3, 'blue'),
(4, 'sky');

ALTER TABLE `MySentences`
  ADD PRIMARY KEY (`id`);

ALTER TABLE `WordTable`
  ADD PRIMARY KEY (`id`);

ALTER TABLE `MySentences`
  MODIFY `id` int(11) NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=8;

ALTER TABLE `WordTable`
  MODIFY `id` int(11) NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=5;
laylarenee
  • 3,276
  • 7
  • 32
  • 40

2 Answers2

1

I made a query using the LIKE operator in the JOIN clause which will find any sentence that contains word. The LIKE operator uses wildcards % which will match anything.

SELECT
    A.theSentence, B.theWord
FROM
    MySentences A
    INNER JOIN WordTable B ON A.theSentence LIKE CONCAT('%',B.theWord,'%');

If you are interested in just the sentence that was matched, you could use the DISTINCT operator to see distinct results:

SELECT
    DISTINCT A.theSentence
FROM
    MySentences A
    INNER JOIN WordTable B ON A.theSentence LIKE CONCAT('%',B.theWord,'%');
laylarenee
  • 3,276
  • 7
  • 32
  • 40
0

You split your string into rows using something like this

SQL split values to multiple rows

You need a separator char probably space _, but also be carefull, may have to remove special chars like , . : ;

Then you join that result to your WordTable and find which words are there.

Community
  • 1
  • 1
Juan Carlos Oropeza
  • 47,252
  • 12
  • 78
  • 118