A string is considered valid if it contains only alphanumeric characters, `_`, `-`, and Thai characters; otherwise it is invalid. Characters such as `.` or `@` are invalid.

Based on the PHP documentation, the following regex should work:

^[\w\-\p{Thai}]+$

It even seems to work as expected here: https://regex101.com/r/rfwjng/1

However, it doesn't work in my PHP code, nor does it work here: https://www.phpliveregex.com/p/wDf

private function containsInvalidCharacters($value) {
    return !preg_match('/^[\w\-\p{Thai}]+$/', $value);
}

Update 1:

To clarify: if I add /u at the end, the pattern starts matching unwanted characters on my local machine, although it seems to work at the referenced link (as suggested in the comments).

Update 2:

The issue was resolved by using `'/^[\wก-๙-]+$/u'`. For some reason, `\p{Thai}` was not giving consistent results across PHP versions. See here: https://3v4l.org/4hB9e
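For reference, here is a minimal sketch of the validator with the pattern from Update 2 applied. It is written as a standalone function rather than the private method above, and the type declarations and the `!== 1` comparison are my own additions. It assumes both the input and the source file are saved as UTF-8.

```php
<?php
// Sketch only: assumes $value and this source file are both UTF-8 encoded.
function containsInvalidCharacters(string $value): bool
{
    // Character class: \w (word characters), the explicit Thai range ก-๙
    // (U+0E01 to U+0E59), and a literal hyphen placed last so it cannot be
    // read as a range operator. The u modifier makes PCRE treat the pattern
    // and the subject as UTF-8.
    // preg_match returns 1 on a match, 0 on no match, and false on error
    // (for example malformed UTF-8); anything other than 1 counts as invalid.
    return preg_match('/^[\wก-๙-]+$/u', $value) !== 1;
}

var_dump(containsInvalidCharacters('abc_123-ไทย'));      // bool(false)
var_dump(containsInvalidCharacters('user@example.com')); // bool(true)
```

Comparing against `1` instead of negating the result keeps the error case (`false`) explicit, so a string that is not valid UTF-8 is reported as invalid rather than slipping through a loose comparison.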

Tom
  • You need the `u` flag: `'/^[\w\-\p{Thai}]+$/u'`. Also, it is safer to move `-` to the end: `'/^[\w\p{Thai}-]+$/u'` – Wiktor Stribiżew Jul 30 '20 at 13:11
  • No, `u` works [as expected](https://3v4l.org/4hB9e). – Wiktor Stribiżew Jul 30 '20 at 13:17
  • `u` is for UTF-8. If it doesn't work for you, what encoding are you using? – Álvaro González Jul 30 '20 at 15:02
  • @ÁlvaroGonzález I checked, and it's UTF-8 too; see the screenshot: https://imgur.com/a/e4w1Kyc – Tom Jul 30 '20 at 15:55
  • Thai person swinging by. This is what I use for Thai characters: `/[ก-๙]+/` https://regex101.com/r/sDPcjH/1 – Doppio Jul 30 '20 at 16:14
  • I just noticed that in this same link, https://3v4l.org/4hB9e, the results vary between PHP versions, so they are not consistent. Thanks, Doppio, for sharing that; it seems to solve the issue. Maybe post it as an answer? – Tom Jul 30 '20 at 16:23