-1

NOTICE TO THE MODS: DON'T DELETE/ DON'T CLOSE I asked this question earlier and the mods closed it because they thought it was similar to a question by another user. I have looked at the thread that they referred me to and it doesn't contain the kind of numeric problems that I have. That thread is How do I match an entire string with a regex?


My Question/Issue: REGEXP is returning a false positive.

SELECT '123456' REGEXP '[0-9]{1,4}' AS Test;

Based on what I've read, the part with the curly brace {1,4} means minimum of 1 occurrence and no more than 4. But from the above , the occurrence of the range [0-9] is more than 4 and yet the query returns a 1 instead of a 0. I've attached a screenshot. What am I missing? Thanks.

Screenshot of the example in Workbench

  • MySQL `REGEXP` returns true if the expression matches **part** of the text. – Bohemian Aug 02 '22 at 23:21
  • the linked question is exactly what you are missing. your example is matching because '123456' does indeed *contain* a 1-4 digit sequence. if you mean to check that the entire string *is* a 1-4 digit sequence, you need anchors, as shown in the answer to the other question. https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=1dc59aa8271985ad17e5e1a1b8fea045 – ysth Aug 02 '22 at 23:41
  • 1
    @ysth , I didn't know about anchors. Thanks for the dbfiddle link. Much appreciated. – Theo Fitchner Aug 03 '22 at 09:42

1 Answers1

0
SELECT '123456' REGEXP '^[0-9]{1,4}$' AS Test;

By "anchoring" you are asking to match the entire string. The above will fail because of the limit of 4.

SELECT '123456' REGEXP '^[0-9]{1,}$' AS Test;

Passes because it allows at least number of digits.

SELECT 'zzz123456' REGEXP '^[0-9]{1,}$' AS Test; -- Fail
SELECT '123456' REGEXP '^[0-9]*$' AS Test;  -- pass
SELECT '' REGEXP '^[0-9]{1,}$' AS Test;  -- fail (too short)
SELECT '' REGEXP '^[0-9]+$' AS Test;  -- same as {1,}
SELECT 'abc123456def' REGEXP '[0-9]{1,4}' AS Test; -- pass (no anchor)
SELECT 'abc123456def' REGEXP '^[^0-9]+[0-9]{1,4}[^0-9]+$' AS Test;  -- fail
SELECT 'abc123456def' REGEXP '[^0-9]*[0-9]+[^0-9]*' AS Test;  -- pass

Those last two include [^0-9], which means "anything except 0-9.

Elaboration on ^

  • At the beginning of the regexp, ^ "anchors" the processing at the beginning: REGEXP "^x" means "starts with x"; REGEXP "x" succeeds if "x" is anywhere in the string.
  • At the beginning of a "character set", ^ means "not": REGEXP "x[0-9]" looks for x followed immediately by a digit' REGEXP "x[^0-9]" looks for x not immediately followed by a digit.
Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Thanks for your explanations. They are very helpful. However, I'm confused about the true function of the caret symbol: ^ A video on Youtube by Sadia of the Crack Concepts channel, stated that the ^ symbol is used for negation. You said the same thing. However, the MySQL official documentation states that the symbol is used to denote the beginning of a pattern. The anchors example has also used it this way. So, it seems the ^ symbol has two functions or am I mistaken? – Theo Fitchner Aug 10 '22 at 17:48
  • The 6th example: SELECT 'abc123456def' REGEXP '[^0-9]*[0-9]{1,4}[^0-9]*' AS Test; will pass because the string expression fits the pattern specified. I have also tested it in dbfiddle and it returned a 1. – Theo Fitchner Aug 10 '22 at 17:53
  • @TheoFitchner - I added a discussion of the _two_ uses for caret. My bad in #6; I will fix it. – Rick James Aug 10 '22 at 18:08