My regex validates first and last names. The acceptable forms are as follows:

  • Jacob Wellman
  • Wellman, Jacob
  • Wellman, Jacob Wayne
  • O’Shaughnessy, Jake L.
  • John O’Shaughnessy-Smith
  • Kim

The unacceptable forms are as follows:

  • Timmy O’’Shaughnessy
  • John O’Shaughnessy--Smith
  • K3vin Malone
  • alert(“Hello”)
  • select * from users;

My current regex is as follows.

^[\w'\-,.][^0-9_!¡?÷?¿\\+=@#$%ˆ&*(){}|~<>;:[\]]{2,}$

It works properly for validating all of the names except for:

  • Timmy O’’Shaughnessy
  • John O’Shaughnessy--Smith

The reason for this is that the regex doesn't take into account consecutive identical special characters. How can I change my regex to take those into account?
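
The behaviour described above can be reproduced with a short script. This is only a sketch, assuming Python's `re` module and the curly apostrophe (’) used in the examples; the helper name `accepts` is made up for illustration.

```python
import re

# The pattern from the question, copied verbatim.
ORIGINAL = re.compile(r"^[\w'\-,.][^0-9_!¡?÷?¿\\+=@#$%ˆ&*(){}|~<>;:[\]]{2,}$")


def accepts(name: str) -> bool:
    """Return True if the question's original pattern matches the name."""
    return ORIGINAL.match(name) is not None


# Names that are supposed to be rejected.
SHOULD_FAIL = [
    "Timmy O’’Shaughnessy",
    "John O’Shaughnessy--Smith",
    "K3vin Malone",
    "alert(“Hello”)",
    "select * from users;",
]

for name in SHOULD_FAIL:
    # The two names with doubled special characters are still accepted.
    print(f"{name!r}: accepted={accepts(name)}")
```

The pattern has no rule about neighbouring characters, so ’’ and -- pass as long as each individual character is allowed.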

Ali Khabib

3 Answers

You can exclude consecutive identical special characters with a negative lookahead that contains a backreference. Placed at the start of the pattern, it asserts that no ’ or - is directly followed by the same character: ^(?!.*([’-])\1)

Note that your current pattern only matches names that are at least 3 characters long, so it will not match names like Al.

If you want to match those as well, you can change {2,} to + in the pattern.

^(?!.*([’-])\1)[\w',.-][^\n\r0-9_!¡?÷¿\\+=@#$%ˆ&*(){}|~<>;:[\]]{2,}$
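
As a rough check, the pattern can be run against the names from the question. This is a sketch under a few assumptions: Python's `re` module, the curly apostrophe (’) from the examples, and illustrative helper names (`is_valid_name`, `is_valid_name_relaxed`) that are not part of the answer. The relaxed variant only swaps {2,} for + as suggested above.

```python
import re

# Lookahead version from this answer, copied verbatim.
STRICT = re.compile(
    r"^(?!.*([’-])\1)[\w',.-][^\n\r0-9_!¡?÷¿\\+=@#$%ˆ&*(){}|~<>;:[\]]{2,}$"
)
# Same pattern with {2,} changed to +, so two-character names also match.
RELAXED = re.compile(
    r"^(?!.*([’-])\1)[\w',.-][^\n\r0-9_!¡?÷¿\\+=@#$%ˆ&*(){}|~<>;:[\]]+$"
)


def is_valid_name(name: str) -> bool:
    """Match against the pattern exactly as given in the answer."""
    return STRICT.match(name) is not None


def is_valid_name_relaxed(name: str) -> bool:
    """Match against the variant that also allows two-character names."""
    return RELAXED.match(name) is not None


ACCEPTED = [
    "Jacob Wellman",
    "Wellman, Jacob",
    "Wellman, Jacob Wayne",
    "O’Shaughnessy, Jake L.",
    "John O’Shaughnessy-Smith",
    "Kim",
]
REJECTED = [
    "Timmy O’’Shaughnessy",
    "John O’Shaughnessy--Smith",
    "K3vin Malone",
    "alert(“Hello”)",
    "select * from users;",
]

for name in ACCEPTED + REJECTED:
    print(f"{name!r}: valid={is_valid_name(name)}")
```

The lookahead is checked once at the start of the string, so it rejects a doubled ’ or - anywhere in the name before the rest of the pattern is tried.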

Matching names can be difficult. This page is an interesting read on the subject:

Falsehoods Programmers Believe About Names

The fourth bird
^(?:[^0-9'\-\., _!¡?÷?¿\\+=@#$%ˆ&*(){}|~<>;:[\]]+(?:['-]|, | |\.|\. |$))+$

I took your forbidden character set, added '\-\., and the space to it, and let that class repeat with +. After it comes a group of allowed separators, (?:['-]|, | |\.|\. |$), and the whole sequence is allowed to repeat with +.
I tried it here.
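
A small sketch of how this pattern behaves, assuming Python's `re` module, assuming the groups are meant to be non-capturing `(?:...)`, and assuming names use the straight apostrophe (') that appears in the pattern. The helper name `matches_separated` is made up for illustration.

```python
import re

# Runs of name characters, each followed by one allowed separator
# or by the end of the string.
SEPARATED = re.compile(
    r"^(?:[^0-9'\-\., _!¡?÷?¿\\+=@#$%ˆ&*(){}|~<>;:[\]]+(?:['-]|, | |\.|\. |$))+$"
)


def matches_separated(name: str) -> bool:
    """Return True if the name is made of words joined by single separators."""
    return SEPARATED.match(name) is not None


GOOD = [
    "Jacob Wellman",
    "Wellman, Jacob",
    "Wellman, Jacob Wayne",
    "O'Shaughnessy, Jake L.",
    "John O'Shaughnessy-Smith",
    "Kim",
]
BAD = [
    "Timmy O''Shaughnessy",
    "John O'Shaughnessy--Smith",
    "K3vin Malone",
    'alert("Hello")',
    "select * from users;",
]

for name in GOOD + BAD:
    print(f"{name!r}: valid={matches_separated(name)}")
```

Because the pattern only lists the straight apostrophe, a curly ’ is treated as an ordinary name character, so input would need to be normalized first (or ’ added to both the forbidden set and the separator group) for doubled curly apostrophes to be rejected.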

David Lukas

You could handle this separately, before your validation, by collapsing each run of a repeated special character into a single one. With a Perl regex, it would be:

s/(\W)\1+/$1/g

so for example:

$ echo "John O’’Shaughnessy--Smith" | perl -C -pe 's/(\W)\1+/$1/g'
John O’Shaughnessy-Smith
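
The same substitution can be sketched in Python for anyone not using Perl. This swaps in Python's `re.sub` for the Perl one-liner; the function name `collapse_repeats` is made up for illustration.

```python
import re


def collapse_repeats(name: str) -> str:
    """Collapse each run of one repeated non-word character into a single one."""
    # (\W) captures one non-word character; \1+ matches further copies of it.
    return re.sub(r"(\W)\1+", r"\1", name)


print(collapse_repeats("John O’’Shaughnessy--Smith"))
# John O’Shaughnessy-Smith
```

Note that this cleans the input instead of rejecting it, and that \W also covers whitespace, so doubled spaces are collapsed as well.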
mivk