2
create table tbl (
  id int,
  comment varchar(255),
  primary key (id)
);

insert into tbl (id, comment) values ('1', 'dumb,');
insert into tbl (id, comment) values ('2', 'duuumb,');
insert into tbl (id, comment) values ('3', 'dummb');
insert into tbl (id, comment) values ('4', 'duummb');
insert into tbl (id, comment) values ('5', 'very dumb person');

select comment, soundex(comment) 
from tbl;

Result:

+------------------+------------------+
| comment          | soundex(comment) |
+------------------+------------------+
| dumb,            | D510             |
| duuumb,          | D510             |
| dummb            | D510             |
| duummb           | D510             |
| very dumb person | V6351625         |
+------------------+------------------+

I want to find all rows containing 'dumb', including all typos and variations, anywhere in the field.

select comment 
from tbl
where soundex(comment) like '%D510%'

This fails to get the final row #5, how can I also get that row? If there is a better solution than soundex() that would be fine.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
davidjhp
  • 7,816
  • 9
  • 36
  • 56
  • Do you also need to find `very duuumb person`? – Barmar Feb 13 '14 at 21:46
  • yes every possibility, which soundex() can do – davidjhp Feb 13 '14 at 22:30
  • See http://stackoverflow.com/questions/1096679/can-mysql-split-a-column for how to split a string into multiple rows at a delimiter. You can then use this as a subquery in your original query. – Barmar Feb 13 '14 at 22:39

3 Answers3

4

This will work for your particular example:

select comment 
from tbl
where soundex(comment) like '%D510%' or comment like '%dumb%';

It won't find misspellings in the comment.

EDIT:

You could do something like this:

select comment
from tbl
where soundex(comment) = soundex('dumb') or
      soundex(substring_index(substring_index(comment, ' ', 2), -1)  = soundex('dumb') or
      soundex(substring_index(substring_index(comment, ' ', 3), -1)  = soundex('dumb') or
      soundex(substring_index(substring_index(comment, ' ', 4), -1)  = soundex('dumb') or
      soundex(substring_index(substring_index(comment, ' ', 5), -1)  = soundex('dumb');

A bit brute force.

The need to do this suggests that you should consider a full text index.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Can you try with MySQL REGEXP? Is a good solution to find a specific word into text.

You can use [[:<:]] and [[:>:]] as word boundaries:

SELECT comment FROM tbl WHERE comment REGEXP '[[:<:]]dumb[[:>:]]'
0

Assuming you want the whole field and not just the matching element then this should work as soundex will never find an element in the middle of an other element;

select comment from tbl where (soundex(comment) like '%D510%' or comment like '%d%mb')

Edit Changed like for cases where the U is something else ie Damb which gives a soundex of D510

Steve
  • 388
  • 1
  • 3
  • 14
  • This won't find `very duuumb person` – Barmar Feb 13 '14 at 22:35
  • True. Changed the last like. Unless you broke the search string into parts and did a soundex match on the parts which could get very messy :) a like to capture d%mb would work as it's the dmb that gives a soundex of D510 – Steve Feb 13 '14 at 23:45