The regex pattern /[^[:ascii:]]+/ui matches one or more non-ASCII characters.

The regex pattern /[\p{L}]+/ui matches one or more characters in the Unicode 'letter' class.

I can't figure out how to match one or more characters that are in the Unicode 'letter' class AND are not ASCII characters.

Karolis

2 Answers

You can use a negated character class like this:

[^\P{L}[:ascii:]]+

This matches one or more characters that are neither ASCII nor matched by \P{L} (the inverse of \p{L}). In other words, each matched character must be a letter and must be non-ASCII.
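As a minimal sketch of the negated class in PHP (the sample string and variable names are hypothetical, and the u flag is assumed so that the subject is treated as UTF-8):

```php
<?php
// Hypothetical sample text mixing ASCII letters, digits, and non-ASCII letters.
$text = 'abc Москва 123 straße';

// [^\P{L}[:ascii:]] excludes non-letters and all ASCII,
// so only non-ASCII letters remain.
preg_match_all('/[^\P{L}[:ascii:]]+/u', $text, $matches);

print_r($matches[0]); // the two runs found: "Москва" and "ß"
```

Note that the ASCII letters around "ß" in "straße" are skipped, so the run there is a single character.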


Alternatively, you can use a negative lookahead inside a non-capturing group:

(?:(?![[:ascii:]])\p{L})+

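The lookahead form can be tried the same way. The sketch below wraps each run of non-ASCII letters in brackets with preg_replace; the sample string is hypothetical and the u flag is again assumed:

```php
<?php
// Hypothetical sample text.
$text = '東京 Tokyo 2024 Ørsted';

// At each position, (?![[:ascii:]]) rejects ASCII characters,
// then \p{L} requires the character to be a letter.
$marked = preg_replace('/(?:(?![[:ascii:]])\p{L})+/u', '[$0]', $text);

echo $marked, PHP_EOL; // [東京] Tokyo 2024 [Ø]rsted
```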
anubhava

You can use

[^\P{L}A-Za-z]+

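It matches one or more Unicode letters that are not ASCII letters. Since A-Z and a-z are the only ASCII characters in \p{L}, excluding them is enough to exclude all ASCII.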

In PHP, you should use the u flag to make it work correctly with Unicode strings:

$regex = '/[^\P{L}A-Za-z]+/u';
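
One rough way to check that the patterns suggested on this page agree is to run them over the same input and compare the results. The sample string and variable names below are hypothetical, and agreement on one input is only a sanity check, not a proof of equivalence:

```php
<?php
// The three patterns from this page, each with the u flag.
$patterns = [
    '/[^\P{L}[:ascii:]]+/u',
    '/(?:(?![[:ascii:]])\p{L})+/u',
    '/[^\P{L}A-Za-z]+/u',
];

// Hypothetical sample text.
$text = 'Tokyo 東京, Москва_42, straße!';

$results = [];
foreach ($patterns as $pattern) {
    preg_match_all($pattern, $text, $m);
    $results[] = $m[0]; // all full matches for this pattern
}

// True when every pattern produced the same list of matches.
$allEqual = $results[0] === $results[1] && $results[1] === $results[2];
var_dump($allEqual);
```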
Wiktor Stribiżew