1

I want to select all words like

baaac
czzzd
dkcvvvvz

They all have letters such ANY letters such as a,z, and v in them that are repeated more than twice in a sequence

Is that possible with REGEX? Here is how I query my table for three repetitions of letter a, I want to generalize this a little bit so that I don't have to query for every letter

select word FROM `allwords` WHERE word REGEXP '^.*aaa.*$'
Yasser1984
  • 2,401
  • 4
  • 32
  • 55
  • Your question is a bit unclear. Do you want to select words that have *any* repeated letters? Like "book"? How about "baby"? Repeated twice, three times, or does it matter? Do you want to consider *any* repeated letter, or just a, z, and v? – Greg Hewgill May 09 '12 at 04:41
  • @GregHewgill it says more than 2 in the title. But the question is somewhat unclear. – Derk-Jan May 09 '12 at 04:45
  • Sorry about the ambiguity, so more than 2, and in a sequence, book would not qualify, and boook would, baby would not qualify. – Yasser1984 May 09 '12 at 04:46
  • possible duplicate of [How can I find repeated characters with a regex in Java?](http://stackoverflow.com/questions/664194/how-can-i-find-repeated-characters-with-a-regex-in-java) – Phrogz May 09 '12 at 04:51
  • '(\\w)\\1+' returned empty set in mysql – Yasser1984 May 09 '12 at 04:56

4 Answers4

2

It looks like Mysql may support backreferences, which means you could do something like:

WHERE word REGEXP '([a-z])\1{2,}'

but you'd have to test that on your Mysql to be sure.

If you can't use backreferences and you still want to use regex, then you could do the following:

WHERE word REGEXP '(aaa|bbb|ccc|ddd|eee|fff|ggg|hhh|iii|jjj|kkk|lll|mmm|nnn|ooo|ppp|qqq|rrr|sss|ttt|uuu|vvv|www|xxx|yyy|zzz)'
Greg Hewgill
  • 951,095
  • 183
  • 1,149
  • 1,285
  • However, I note that the following feature request appears unresolved: http://bugs.mysql.com/bug.php?id=29590 – Greg Hewgill May 09 '12 at 04:51
  • fwiw, [PostgreSQL supports backreferences in regex](http://www.postgresql.org/docs/9.1/static/functions-matching.html) (section 9.7.3). – Greg Hewgill May 09 '12 at 04:54
0

I gave up and ran

delete FROM `allwords` WHERE word REGEXP '^.*aaa.*$';
delete FROM `allwords` WHERE word REGEXP '^.*bbb.*$';
delete FROM `allwords` WHERE word REGEXP '^.*ccc.*$';
...
delete FROM `allwords` WHERE word REGEXP '^.*zzz.*$';

It's a sad day

Yasser1984
  • 2,401
  • 4
  • 32
  • 55
0

Try this one:

SELECT * FROM mytable WHERE mycolumn REGEXP "^[a-z]*(([a-z]){2}){2,}[a-z]*$";
Cylian
  • 10,970
  • 4
  • 42
  • 55
0

To find whether a string contains more than 2 repeated alphanumeric then use this

/([a-zA-Z0-9])\1{2,}/.test(str) // it returns true if contains and false if not