
I would like to allow letters from any script (Latin, Hebrew, Cyrillic, but not Unicode emojis) plus the minus sign (-). The minus must not appear more than once in a row, and not at the start or end:

YaMo -> OK
Ya-Mo -> OK
Ya-Mo-Ga -> OK
Ya--Mo -> FALSE
Ya---Mo -> FALSE
-Ya -> FALSE
-Ya-Mo- -> FALSE
Ya- -> FALSE
Yo-Mo- Mo -> FALSE
Yo-Mo -Go -> FALSE

So far I have:

preg_match('/^[\p{L} -]+$/', $post['firstname'])

It doesn't handle multiple consecutive occurrences, or a minus at the start or end. Is there a regex approach that does?

As a workaround, I currently use substr to check that the first and last characters are not '-', and strpos to check for '--'.
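The workaround can be sketched roughly as follows. This is a minimal sketch, not the exact code in use: the function name `isValidNameWorkaround` is hypothetical, and the `u` modifier is an assumption added so that `\p{L}` works on UTF-8 input.

```php
<?php
// Hypothetical helper illustrating the substr/strpos workaround.
function isValidNameWorkaround(string $name): bool
{
    // Only letters, spaces and hyphens (u modifier added as an assumption).
    if (preg_match('/^[\p{L} -]+$/u', $name) !== 1) {
        return false;
    }
    // No hyphen as the first or last character.
    if (substr($name, 0, 1) === '-' || substr($name, -1) === '-') {
        return false;
    }
    // No two hyphens in a row.
    return strpos($name, '--') === false;
}
```

Note that these checks still accept `Yo-Mo- Mo`, because a hyphen next to a space is not caught by any of them.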

The proposed similar duplicate is wrong: it does not work as required, because it allows: Yo-Mo- Mo

nenad007
  • 1
    Use `'/^\p{L}+(?:[ -]\p{L}+)*$/u'` - https://regex101.com/r/Uv7NUb/1. I think the hyphen rule also pertains to the space, right? – Wiktor Stribiżew May 11 '18 at 09:49
  • 1
    I found the [post](https://stackoverflow.com/a/4897392/3832970) where all you need is to replace `[A-Za-z0-9-]+` with your `[\p{L} -]+`. Also, add `u` to make the regex fully Unicode compatible and make PCRE treat the input text as Unicode chars. – Wiktor Stribiżew May 11 '18 at 09:55
  • Can a string without hyphens (-) also be allowed? – aelor May 11 '18 at 09:57
  • @aelor yes it can – nenad007 May 11 '18 at 09:57
  • @WiktorStribiżew the proposed duplicate is wrong; I found out it doesn't work as it should. – nenad007 May 11 '18 at 10:09
  • **Do you want to allow spaces?** If not, *why is it in your pattern?* You say, `Yo-Mo- Mo` should not be matched, but why? If you do not want spaces, try my first suggestion then, but take out space, `'/^\p{L}+(?:-\p{L}+)*$/u'`. – Wiktor Stribiżew May 11 '18 at 10:22
  • @Wiktor Stribiżew a space is only allowed between two words. Yo-Mo- Mo is not allowed, but Yo-Mo Mo is allowed. The idea is to allow real names only. – nenad007 May 11 '18 at 10:26
  • So, my top comment regex is working for you, right? Please test. – Wiktor Stribiżew May 11 '18 at 10:26
  • 1
    @WiktorStribiżew yes, the top comment solution works, thank you. I concentrated on the proposed duplicate solution and missed this. – nenad007 May 11 '18 at 10:30

2 Answers


Try this:

^\w+(-\w+)*$
  • \w+ matches one or more word characters (equivalent to [a-zA-Z0-9_])
  • (-\w+)* matches a hyphen followed by one or more word characters, zero or more times
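A quick way to see how this pattern behaves in PHP, as a minimal sketch. The first two sample strings come from the question; `Ya_1-Mo2` is an added illustration, not part of the original post.

```php
<?php
// The pattern from this answer, used without the u modifier.
$pattern = '/^\w+(-\w+)*$/';

var_dump(preg_match($pattern, 'Ya-Mo'));     // single hyphen between words
var_dump(preg_match($pattern, 'Ya--Mo'));    // doubled hyphen
var_dump(preg_match($pattern, 'Ya_1-Mo2'));  // digits and underscore
```

Because \w also covers digits and the underscore, the last string is accepted too, so the pattern is looser than "letters only".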

online demo

aelor

The regex you may use is

'/^\p{L}+(?:[ -]\p{L}+)*$/u'

See the regex demo.

Details

  • ^ - start of string
  • \p{L}+ - 1+ letters
  • (?:[ -]\p{L}+)* - 0+ repetitions of
    • [ -] - a space or -
    • \p{L}+ - 1+ letters
  • $ - end of string (replace with \z to only match the very end of string, and not also before a final LF symbol).

The modifier u is necessary to make it work without issues with Unicode strings.
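For illustration, the pattern could be wrapped and run against the examples from the question as sketched below. The helper name `isValidName` is hypothetical, and the sketch uses \z in place of $ (as suggested above) so that a trailing newline is rejected as well.

```php
<?php
// Hypothetical helper; the name isValidName is not from the original post.
function isValidName(string $name): bool
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match('/^\p{L}+(?:[ -]\p{L}+)*\z/u', $name) === 1;
}

// Sample inputs taken from the question and its comments.
$samples = [
    'YaMo', 'Ya-Mo', 'Ya-Mo-Ga', 'Yo-Mo Mo',
    'Ya--Mo', '-Ya', '-Ya-Mo-', 'Ya-', 'Yo-Mo- Mo', 'Yo-Mo -Go',
];

foreach ($samples as $sample) {
    echo $sample, ' -> ', isValidName($sample) ? 'OK' : 'FALSE', "\n";
}
```

Comparing with `=== 1` keeps a `false` return (for example, on invalid UTF-8 input) from being treated as a match.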

Wiktor Stribiżew