
I'm trying to find a regexp that covers a lot of cases. The one I'm using now would be enough if it weren't for the many international names that contain special letters as well as hyphens.

The one I'm using now looks like this:

/^[A-Za-zåäöÅÄÖ\s\-\ ]*$/

It allows hyphens and whitespace, but it also allows them at the start or end of the string, which I don't want.

I need to modify this to allow:

  • Special letters such as éýÿüåäö etc. (preferably without having to write them all manually)
  • Capital letter at the start of each new word
  • Whitespace between words
  • Hyphens (-) between words, but not at the start or end of the full string
  • It should not allow numbers, which it already doesn't.

Since I haven't worked much with regex construction, I'm in the dark on how to achieve this. I've found a lot of solutions that cover one scenario or another, but none that cover all of the ones I need. I would appreciate the assistance. The regex should work for PHP validation.

    EDIT:

    $fname = 'Scrooge Mc-Duck'; //Only example string
    $fname = trim($fname);
    
    if (!preg_match('/^\p{Lu}\p{Ll}+([ -]+\p{Lu}\p{Ll}+)*$/', $fname)) {
        $fnameErr = 'Invalid first name'; 
    }
    

    This outputs the error when using @npinti's solution.

    Chrillewoodz
    • What regex engine/platform? – Alex K. Jan 09 '15 at 15:04
    • @AlexK. I'm using PHP to validate if that's what you mean. – Chrillewoodz Jan 09 '15 at 15:05
    • See `\p{L}` @ http://php.net/manual/en/regexp.reference.unicode.php & http://www.regular-expressions.info/unicode.html – Alex K. Jan 09 '15 at 15:08
    • I think that if your system is running in utf-8 you can match any "Perl Word" character (including Cyrillic, Japanese, Arabic and so on characters) with `\w` : http://stackoverflow.com/questions/5555613/does-w-match-all-alphanumeric-characters-defined-in-the-unicode-standard – CD001 Jan 09 '15 at 15:21
    • Trying to match names is futile. Who says that name parts will only start with uppercase letters? Who says that names cannot include digits? Please read [Falsehoods Programmers Believe About Names](http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/) for some perspective. – amon Jan 09 '15 at 16:39
    • @amon you can match names quite easily `/^.*$/` does the trick :P – CD001 Jan 09 '15 at 16:41
    • @amon Well that's true but I would prefer if people can't enter ridiculous made up names. That's why I want to be somewhat strict. – Chrillewoodz Jan 09 '15 at 17:51

    1 Answer


    Assuming that your regular expression engine supports Unicode character properties, you can use \p{L} to match any letter (\p{Lu} for upper case, \p{Ll} for lower case). So, to match a name, you could use ^\p{Lu}\p{Ll}+([ -]+\p{Lu}\p{Ll}+)*$.

    This matches an upper case letter followed by one or more lower case letters. This can in turn be followed, zero or more times, by one or more spaces or hyphens and then another upper case letter and one or more lower case letters. The ^ and $ at the beginning and end make sure that the regular expression matches the entire string.
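As a rough sketch of how the pattern above might be used from PHP: the `u` modifier asks PCRE to treat both pattern and subject as UTF-8, which `\p{Lu}` and `\p{Ll}` need in order to see multibyte letters such as Å as single characters. The helper name `is_valid_name` is hypothetical (it is not from the original post), and the sketch assumes UTF-8 input and a PCRE build with Unicode support.

```php
<?php
// Hypothetical helper wrapping the pattern from the answer.
// Assumption: $name is UTF-8 encoded; the `u` modifier enables UTF-8 mode.
function is_valid_name(string $name): bool
{
    $pattern = '/^\p{Lu}\p{Ll}+([ -]+\p{Lu}\p{Ll}+)*$/u';

    // preg_match returns 1 on a match, 0 on no match, false on error
    // (for example, when the subject is not valid UTF-8).
    return preg_match($pattern, trim($name)) === 1;
}

var_dump(is_valid_name('Scrooge Mc-Duck')); // bool(true)
var_dump(is_valid_name('Åsa Öberg'));       // bool(true)
var_dump(is_valid_name('christoffer'));     // bool(false): no leading capital
var_dump(is_valid_name('-Anna'));           // bool(false): leading hyphen
```

Comparing against `=== 1` treats an engine error the same as a failed match, which may or may not be what a real validator wants.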

    A demo of the regex can be viewed here.

    npinti
    • It doesn't appear to support it unfortunately, is there an alternative? – Chrillewoodz Jan 09 '15 at 15:18
    • @Chrillewoodz: What version of PHP are you using? – npinti Jan 09 '15 at 15:20
    • @PaulCrovella: I have tried using the approach you mentioned but for some reason I could not get it to work. If you can provide an example it would be useful :). – npinti Jan 09 '15 at 15:21
    • @Chrillewoodz: According to [this](http://php.net/manual/en/regexp.reference.unicode.php) you should be covered. – npinti Jan 09 '15 at 15:22
    • Ye I read that so I'm not sure why it's not working, maybe I'm doing it wrong? I'll update question and you can have a look. – Chrillewoodz Jan 09 '15 at 15:24
    • Also found out that it doesn't allow for example 'christoffer' but it allows 'Christoffer' – Chrillewoodz Jan 09 '15 at 15:30
    • @Chrillewoodz: Yes that is one of your requirements no? Also, with regards to the PHP not working, I managed to execute it [here](http://writecodeonline.com/php/). Did you try adding the `u` flag as recommended by `Paul Crovella`? – npinti Jan 09 '15 at 15:35
    • Ye I tried it but it still doesn't work for some reason. – Chrillewoodz Jan 09 '15 at 15:39
    • @Chrillewoodz: Unfortunately I cannot do much more. I would recommend you start by using something like `\p{Lu}` and try to match a sample string such as `Bob`. This way, if it works, you know that there is something wrong with the regex. But if it fails, then you know that most likely you do not have the support you need. – npinti Jan 09 '15 at 15:44
    • I tried it and it fails, is there any way to get the support I need? Seems silly that I don't have it since I'm on the correct version of PHP.. Stuff like this drives me insane. – Chrillewoodz Jan 09 '15 at 15:48
    • @Chrillewoodz: First thing that comes to mind would be to try and update your PHP version. Unfortunately I am not a PHP person so I cannot provide further assistance. – npinti Jan 09 '15 at 15:54
    • This RegExp will actually fail on names like *Bob McDonald* or *Paul O'Grady* ... and it's quite legal to change your name by deed poll to just *Trogdor* if you wanted. – CD001 Jan 09 '15 at 16:38
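To illustrate the limitation raised in the last comment, here is a looser sketch that drops the capitalisation rule and also accepts an apostrophe as a separator. It is only one possible trade-off, not the answer's method: the helper name `is_plausible_name` and the choice of separators (a single space, hyphen, or apostrophe between runs of letters) are assumptions made for illustration.

```php
<?php
// Hypothetical, more permissive variant (not from the original answer).
// Assumption: names are runs of letters (\p{L}, any case) joined by exactly
// one space, hyphen, or apostrophe; separators may not start or end the string.
function is_plausible_name(string $name): bool
{
    $pattern = '/^\p{L}+(?:[ \'-]\p{L}+)*$/u';
    return preg_match($pattern, trim($name)) === 1;
}

var_dump(is_plausible_name('Bob McDonald'));  // bool(true)
var_dump(is_plausible_name("Paul O'Grady"));  // bool(true)
var_dump(is_plausible_name('Trogdor'));       // bool(true)
var_dump(is_plausible_name('Anna-'));         // bool(false): trailing hyphen
var_dump(is_plausible_name('R2D2'));          // bool(false): digits rejected
```

This still rejects digits and stray separators, but it no longer enforces a capital letter at the start of each word, so it relaxes one of the requirements listed in the question.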