
I'm trying to validate a string in PHP using a regex; it may only contain letters (including accented Latin letters such as 'á', 'õ', etc.) and spaces.

Using preg_replace('/\P{L}/u', '', $str); I get rid of everything (including the spaces) except the letters. What do I need to change in the regex to keep the spaces as well?

Picoral
  • Maybe negate the negation: `[^\pL ]` matches anything that is neither a letter (in any language) nor a space – Sindhara Mar 24 '19 at 17:06

1 Answer

You may use

preg_replace('/[^\p{L}\s]+/u', '', $str);

The [^\p{L}\s]+ pattern matches one or more occurrences of any character other than a Unicode letter or whitespace. Note that, because of the u modifier, \s also recognizes Unicode whitespace characters.

See the regex demo.

Details

  • [^ - start of a negated character class that matches any char but
    • \p{L} - any Unicode letter
    • \s - whitespace
  • ]+ - 1 or more times.

If the input may contain combining diacritical marks (for example, an 'á' stored as 'a' followed by a separate combining accent) and you want to keep them, add \p{M} to the negated character class: /[^\p{L}\p{M}\s]+/u.
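To see why \p{M} can matter, here is a minimal sketch. It is written in JavaScript rather than PHP purely for illustration and assumes an engine with ES2018 Unicode property escapes; the helper names `lettersOnly` and `lettersAndMarks` are hypothetical. A precomposed 'ã' is a single letter, while a decomposed one is 'a' plus a combining mark that \p{L} alone does not cover.

```javascript
// Hypothetical helpers for illustration; require ES2018 \p{...} support.
const lettersOnly = (s) => s.replace(/[^\p{L}\s]+/gu, '');
const lettersAndMarks = (s) => s.replace(/[^\p{L}\p{M}\s]+/gu, '');

const precomposed = 'S\u00E3o';  // "São" with ã as one code point (U+00E3)
const decomposed = 'Sa\u0303o';  // "São" as a + U+0303 COMBINING TILDE

// The precomposed letter is matched by \p{L}, so it is kept as-is.
console.log(lettersOnly(precomposed) === precomposed);   // true

// Without \p{M}, the combining tilde is removed and only "Sao" remains.
console.log(lettersOnly(decomposed) === 'Sao');          // true

// With \p{M} in the class, the decomposed form is kept intact.
console.log(lettersAndMarks(decomposed) === decomposed); // true
```

Whether the extra class is needed depends on how the input is normalized; text that is always in precomposed (NFC) form will often work without it.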

Wiktor Stribiżew
  • Thanks! How can I use this same regex with JS? I tried `value.match('/[^\p{L}\s]+/u')` in which `value` is `***` but it returns `null` – Picoral Mar 24 '19 at 18:00
  • @fpicoral If you only need to support Chrome or another JS environment that implements ECMAScript 2018, use `value.replace(/[^\p{L}\s]+/gu, '')`. In environments limited to older standards, try `value.replace(/[^A-Za-z\s]+/g, '')` (they do not support `\p{L}`). Or, see [equivalents here](https://stackoverflow.com/a/37668315/3832970). – Wiktor Stribiżew Mar 24 '19 at 18:55
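The JavaScript exchange above can be sketched as follows, assuming an ES2018-capable engine. The helper names `stripNonLetters` and `isLettersAndSpaces` are hypothetical, and the anchored validation pattern is an assumption about what "validate" should mean here, not something stated in the thread. The sketch also shows why the quoted pattern returned `null`: a string passed to `match` is compiled as the pattern source, so the slashes and the trailing `u` are treated as literal characters rather than delimiters and a flag.

```javascript
// Hypothetical helpers; both assume ES2018 Unicode property escapes.
const stripNonLetters = (value) => value.replace(/[^\p{L}\s]+/gu, '');

// Assumed validation rule: one or more letters or whitespace, nothing else.
const isLettersAndSpaces = (value) => /^[\p{L}\s]+$/u.test(value);

// A string argument is not a regex literal: the pattern now demands a
// literal "/" at the start, so it cannot match "***".
console.log('***'.match('/[^\p{L}\s]+/u'));      // null

// With a real regex literal, the non-letter characters are removed.
console.log(stripNonLetters('Olá, mundo 123!')); // "Olá mundo "

// Validation instead of replacement.
console.log(isLettersAndSpaces('Olá mundo'));    // true
console.log(isLettersAndSpaces('***'));          // false
```

Using `test` with an anchored positive class avoids comparing the cleaned string with the original, but either approach should give the same verdict for non-empty input.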