1

I am using the following query (simplified for here) to check if a string contains a "watch-word" where the watch words are contained in a MySQL table:

$sql = "SELECT ww_id FROM watch_words WHERE ww_word IN (" . $string . ")";

This works perfectly for single words, but now I need to make it work for phrases (i.e. the field ww_word may contain more than one word). All I can think of are things like reading the whole table into an array and then doing multiple loops to compare against combinations of the words in the string, but I'm sure (hoping) there's a better way.

EDIT: Thanks for the suggestions, but as pointed out by Mike Brant, the needle is in MySQL and the haystack in PHP - not the "usual" way around (like a search form for instance). I need to check if a string (actually a message) contains one or more "watch phrases" - like a bad-language filter (but not that).

Sample table thus:

CREATE TABLE `watch_words` (
  `ww_id` int(11) NOT NULL AUTO_INCREMENT,
  `ww_word` varchar(250) NOT NULL,
  PRIMARY KEY (`ww_id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=latin1 AUTO_INCREMENT=6 ;

INSERT INTO `watch_words` VALUES (1, 'foo bar');
INSERT INTO `watch_words` VALUES (2, 'nice sunny day');
INSERT INTO `watch_words` VALUES (3, 'whatever');
INSERT INTO `watch_words` VALUES (4, 'my full name here');
INSERT INTO `watch_words` VALUES (5, 'keyword');

So string "What a nice sunny day we're having" should return a match, whereas "What a lovely sunny day..." wouldn't. TIA.

da5id
  • 9,100
  • 9
  • 39
  • 53

2 Answers2

3

use LIKE for pattern matching

$sql = "SELECT ww_id FROM watch_words WHERE ww_word LIKE '%" . $string . "%'";

or maybe interchange the two,

$sql = "SELECT ww_id FROM watch_words WHERE " . $string . " LIKE CONCAT('%', ww_word,'%')";

As a sidenote, the query is vulnerable with SQL Injection if the value(s) came from the outside. Please take a look at the article below to learn how to prevent from it. By using PreparedStatements you can get rid of using single quotes around values.

Community
  • 1
  • 1
John Woo
  • 258,903
  • 69
  • 498
  • 492
  • Also worth noting that leading with a wildcard ( $ _ ) in a LIKE prevents the use of indexes so this could impact performance on a large table. – twoleggedhorse Jan 24 '13 at 01:17
  • I'd also recommend looking into MATCH AGAINST. If you have multiple words with LIKE you would need to do something like `ww_word` LIKE '%". $word1 ."%' OR `ww_word` LIKE '%". $word2 ."%' etc – Kai Qing Jan 24 '13 at 01:17
  • Thanks, I will test. I thought this would only work the other way around though (i.e. if the database contained the string and the phrase was searched for). I am searching messages to see if they contain certain phrases (like a bad-word alert kind of thing, but not that). – da5id Jan 24 '13 at 01:19
  • @da5id what is the engine you are using? If it's `MyISAM`, you can take advantage of `FULL TEXT SEARCH`. – John Woo Jan 24 '13 at 01:20
  • @KaiQing yes, if the engine supports `FULL TEXT SEARCH` which is better. – John Woo Jan 24 '13 at 01:20
  • It's InnoDB, but I think the issue is as Mike Brant says. I'm going to do some testing & will come back... – da5id Jan 24 '13 at 01:23
  • This doesn't work. As Mike Bryant says, the problem is the other way around. I'm going to edit my post to clarify. – da5id Jan 24 '13 at 01:30
  • @da5id how abouyt interchanging the two? i updated the answer. – John Woo Jan 24 '13 at 01:33
  • Thanks but it still doesn't work. I'm beginning to think I'm going to have to put together a brute-force PHP looping solution :( – da5id Jan 24 '13 at 01:41
  • can you give sample records on your table and possible value for `$string` ? – John Woo Jan 24 '13 at 01:45
1

You will likely need to take a different approach here. You have the needle in MySQL and the haystack in PHP. Using things like LIKE (which you use for string matches not IN), MySQL can work fine with the haystack being in MySQL table and the needle in the application (in the LIKE).

There is no convenient reverse matching to pass MySQL the haystack and have it apply a needle from a field in a table against it.

You will likely need to select your needles out of the database and compare it to the haystack in your application.

Mike Brant
  • 70,514
  • 10
  • 99
  • 103