What is a good one-liner PHP regex for validating first/last name fields that may contain accented characters (in case someone's name is Pièrre)? Here is what I have so far:

<?php
$strErrorMessage = null;

if(!preg_match('/\p{L}0-9\s-+/u', trim($_POST["firstname"])))
  $strErrorMessage = "Your first name can only contain valid characters, ".
    "spaces, minus signs, or numbers.";
?>

This tries to use Unicode property matching, based on this post, but doesn't work correctly. The solution seems pretty hard to google.

Florian Mertens
  • Huh? You want to exclude apostrophes why? Also, I've never heard of a name containing a "minus sign", though I'm aware of many containing hyphens. – TRiG Feb 27 '14 at 11:58
  • @TRiG: I'm pretty sure that the "minus sign" and a hyphen (when typing) are exactly the same.. – Magictallguy Feb 27 '14 at 12:00
  • It's a long shot, but try also `\p{M}` – ex3v Feb 27 '14 at 12:02
  • As a general rule, [do not try to validate name fields](http://ux.stackexchange.com/a/15778/2131). Also, @Magictallguy, it's the same character as far as a programmer is concerned, yes, but calling it a minus sign in the context of a name is just plain weird. – TRiG Feb 27 '14 at 12:03
  • @TriG: Aye, granted. I'll give you that :P – Magictallguy Feb 27 '14 at 12:05
  • Sooooo... what exactly is the validation "rule"? Name may contain... characters? – deceze Feb 27 '14 at 12:25
  • A name contains, and should only contain characters. I've never heard of someone called 'florian :)' or '_-*-Sarah-*-_' on their official documents. There are some exceptions imho, like 'Jean-Pierre' and some cultures with names that are split. Therefore, my validation rule for a name should be unicode characters with optional dashes (or minus sign, but some international places have rarely heard of the word 'dash', or they use the word 'hyphen', so which word would you then use?), and perhaps a space. – Florian Mertens Feb 27 '14 at 18:19
  • [Falsehoods Programmers Believe About Names](http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/) – deceze Feb 27 '14 at 20:18

1 Answer

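Aside from the difficulty of validating names at all, you need to put your characters into a character class. As written, /\p{L}0-9\s-+/u matches only a sequence like "Ä0-9 ------": one letter, then the literal text "0-9", then one whitespace character, then one or more hyphens. What you wanted is: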

/^[\p{L}0-9\s-]+$/u

I also added anchors (^ and $), which ensure that the regex has to match the complete string rather than just some part of it.
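
To make the difference concrete, here is a small sketch comparing the pattern from the question with the corrected one. The sample names and variable names are only illustrative, and the `\u{...}` escape assumes PHP 7.0 or later:

```php
<?php
// Pattern from the question: no character class, so "0-9" is literal text.
$original = '/\p{L}0-9\s-+/u';
// Character class without anchors: succeeds if ANY part of the string fits.
$unanchored = '/[\p{L}0-9\s-]+/u';
// Corrected pattern: character class plus anchors.
$fixed = '/^[\p{L}0-9\s-]+$/u';

// Illustrative inputs; "\u{E8}" is a precomposed e-grave.
$samples = ["Pi\u{E8}rre", 'Jean-Pierre', 'florian :)'];

foreach ($samples as $name) {
    printf(
        "%s: original=%d unanchored=%d fixed=%d\n",
        $name,
        preg_match($original, $name),
        preg_match($unanchored, $name),
        preg_match($fixed, $name)
    );
}
```

The original pattern rejects ordinary names because it insists on the literal text "0-9", while the unanchored class accepts 'florian :)' because the leading letters alone satisfy it. Only the anchored version checks every character of the input.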

As ex3v mentioned, you should probably also add \p{M} to that class so that it matches combining characters (for example, an accent stored as a separate code point after its base letter). See Unicode properties.

/^[\p{L}\p{M}0-9\s-]+$/u
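
As a rough sketch of how this could be wired into PHP, the check can be wrapped in a small function. `validateFirstName` is a hypothetical helper name, not something from the question, and the nullable return type assumes PHP 7.1 or later:

```php
<?php
// Hypothetical helper: returns an error message, or null if the name passes.
function validateFirstName(string $raw): ?string
{
    $name = trim($raw);
    // preg_match() returns 1 on a match, 0 on no match, and false on error
    // (for example, invalid UTF-8 with the /u modifier), so "!" covers both failures.
    if (!preg_match('/^[\p{L}\p{M}0-9\s-]+$/u', $name)) {
        return 'Your first name can only contain letters, spaces, hyphens, or numbers.';
    }
    return null;
}

$precomposed = "Pi\u{E8}rre";   // accented letter as a single code point (U+00E8)
$decomposed  = "Pie\u{300}rre"; // "e" followed by a combining grave accent (U+0300)

var_dump(validateFirstName($precomposed)); // NULL
var_dump(validateFirstName($decomposed));  // NULL
// Without \p{M}, the combining accent is rejected:
var_dump(preg_match('/^[\p{L}0-9\s-]+$/u', $decomposed)); // int(0)
```

In a form handler this might be called as `$strErrorMessage = validateFirstName($_POST["firstname"] ?? '');`, with the caveat from the comments that strict name validation tends to reject real names (apostrophes, for instance, are not allowed by this class).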
stema
  • ...I had hoped that you would add the php code in this too. – Florian Mertens Feb 27 '14 at 18:20
  • @FlorianMertens, which code? `if(!preg_match('/^[\p{L}\p{M}0-9\s-]+$/u', trim($_POST["firstname"]))) $strErrorMessage = "Your first name can only contain valid characters, ". "spaces, minus signs, or numbers.";` ??? – stema Feb 28 '14 at 06:33