I need to write a regex in PHP. The string may contain only A-Z, a-z, 0-9, :, - and _, and it must not end with :.

This is what I tried:

<?php

$strings = [
    'aaa:bbb-cool',
    'aaa:bbb-cool-',
    'aaa-22-bbb_cool3',
    'aaa:bbb-cool:',
    'aaa_bbb-cool:',
    'aaa_bbb-cool',
    'bbbb:>dd',
    'hihi%',
    '大家好',
    '0000000000',
    '11111:2222:3333',
    '11111:2222:3333:',
    'DDD@@@1',
    '大家好',
];

$pattern = '/[0-9a-zA-Z]+$/i';

foreach ($strings as $key => $string) {
    var_dump('number '.$key .' '. $string.' is '.preg_match($pattern, $string));
}

And the result is:

string(26) "number 0 aaa:bbb-cool is 1"
string(27) "number 1 aaa:bbb-cool- is 0"
string(30) "number 2 aaa-22-bbb_cool3 is 1"
string(27) "number 3 aaa:bbb-cool: is 0"
string(27) "number 4 aaa_bbb-cool: is 0"
string(26) "number 5 aaa_bbb-cool is 1"
string(22) "number 6 bbbb:>dd is 1"
string(19) "number 7 hihi% is 0"
string(23) "number 8 大家好 is 0"
string(24) "number 9 0000000000 is 1"
string(30) "number 10 11111:2222:3333 is 1"
string(31) "number 11 11111:2222:3333: is 0"
string(22) "number 12 DDD@@@1 is 1"
string(24) "number 13 大家好 is 0"
  1. I know my pattern does not really prevent ":" at the end, because number 1 should be true. How do I make it right?
  2. Why are number 6 and number 12 true?
Chan

1 Answer

Your pattern neither matches nor rejects the : char. For : to be allowed, the consuming part of the pattern must include it, and rejecting it at the end needs an explicit check. Strings 6 and 12 match because your '/[0-9a-zA-Z]+$/i' pattern only looks for one or more ASCII letters or digits at the end of the string and does not check anything before them. For the same reason, string 1, which ends with -, does not match.
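To see what the original pattern actually consumes, here is a minimal sketch that prints the captured match for a few of the strings from the question. The variable names are only illustrative.

```php
<?php

// The original pattern from the question: there is no ^ anchor,
// so only the tail of the string is checked.
$pattern = '/[0-9a-zA-Z]+$/i';

foreach (['bbbb:>dd', 'DDD@@@1', 'aaa:bbb-cool-'] as $string) {
    if (preg_match($pattern, $string, $m)) {
        // $m[0] holds only the trailing alphanumeric run that matched.
        echo $string, ' => matched "', $m[0], '"', PHP_EOL;
    } else {
        echo $string, ' => no match', PHP_EOL;
    }
}
```

For 'bbbb:>dd' the match is just "dd", and for 'DDD@@@1' it is just "1", which is why the disallowed characters before them go unnoticed.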

You may fix the expression using

'~^[\w:-]+$(?<!:)~'

It matches:

  • ^ - start of string
  • [\w:-]+ - 1 or more word chars (here, ASCII letters, digits or _), : or - chars
  • $ - end of string (in PCRE, $ also matches right before a final trailing newline; use \z to match only at the very end of the string, or add the D modifier after the closing regex delimiter ~)
  • (?<!:) - a negative lookbehind that fails the match if the char immediately before the current position, here the end of the string, is a :.
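As a rough sketch, the pattern can be tried on a few of the strings from the question. The D modifier is added here as an assumption about what is wanted, namely that a trailing newline should also be rejected; the last test string is not from the question and is included only to show that difference.

```php
<?php

// Pattern from the answer, plus D so that $ matches only at the very end.
$pattern = '~^[\w:-]+$(?<!:)~D';

$strings = [
    'aaa:bbb-cool',      // valid
    'aaa:bbb-cool-',     // valid: a trailing "-" is allowed
    'aaa:bbb-cool:',     // invalid: ends with ":"
    'bbbb:>dd',          // invalid: ">" is not an allowed char
    '大家好',             // invalid: without the u modifier, \w is ASCII-only by default
    "aaa_bbb-cool\n",    // invalid with D; it would match without D
];

foreach ($strings as $string) {
    $ok = preg_match($pattern, $string) === 1;
    echo json_encode($string, JSON_UNESCAPED_UNICODE), ' is ', $ok ? 'valid' : 'invalid', PHP_EOL;
}
```

Comparing against === 1 also treats a preg_match() error (which returns false) as invalid input.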
Wiktor Stribiżew
  • Nice. What does `~` do here? is this `php` thing? like `/^regex/`? – JBone Dec 28 '17 at 13:06
  • @JBone `~` is a regex delimiter that I prefer to `/` (`~` is used more seldom than `/` in regex patterns). – Wiktor Stribiżew Dec 28 '17 at 13:07
  • @WiktorStribiżew May I ask why 'hihi%' passes `/^[\w:-]+[^:]$/`? – Chan Dec 28 '17 at 23:18
  • @Chan [`/^[\w:-]+[^:]$/`](https://regex101.com/r/Ehj7Wq/1) passes `hihi%` because it starts with 1+ word, `:` or `-` chars and ends with a char other than `:` (`%` is not a `:`). You need `/^[\w:-]+$(?<!:)/`, or `/^[\w:-]+(?<!:)$/` or `/^(?!.*:$)[\w:-]+$/`. – Wiktor Stribiżew Dec 28 '17 at 23:31
  • Thanks a lot, regex master – Chan Dec 29 '17 at 01:29
  • @WiktorStribiżew What is `(?<!:)` called in regex? I want to learn about it. – Chan Dec 29 '17 at 01:32
  • @Chan, `(?<!...)` is a [**negative lookbehind**](https://www.regular-expressions.info/lookaround.html). Here is [one of my answers explaining how a lookbehind works](https://stackoverflow.com/questions/40512766/star-vs-plus-quantifier-in-the-variable-width-negative-lookbehind/40516662#40516662). – Wiktor Stribiżew Dec 29 '17 at 09:54
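
The three alternatives from the comments, together with the `[^:]` attempt, can be compared side by side. This is only a small sketch; the labels in the array are made up for readability.

```php
<?php

$patterns = [
    'lookbehind after $'  => '/^[\w:-]+$(?<!:)/',
    'lookbehind before $' => '/^[\w:-]+(?<!:)$/',
    'lookahead at start'  => '/^(?!.*:$)[\w:-]+$/',
    'flawed [^:] version' => '/^[\w:-]+[^:]$/',   // [^:] accepts any char but ":", e.g. "%"
];

foreach (['aaa:bbb-cool', 'aaa:bbb-cool:', 'hihi%'] as $string) {
    foreach ($patterns as $label => $pattern) {
        // Columns: input string, pattern label, preg_match() result (1 or 0).
        printf("%-14s %-20s %d\n", $string, $label, preg_match($pattern, $string));
    }
}
```

The first three patterns agree on all three inputs, while the `[^:]` version differs only on `hihi%`, which it accepts.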