5

I am trying to validate a String which contains the first & last name of a person. The acceptable formats of the names are as follows.

Bruce Schneier                  
Schneier, Bruce
Schneier, Bruce Wayne
O’Malley, John F.
John O’Malley-Smith
Cher

I came up with the following program to validate the String variable. The validateName function should return true if the name matches any of the formats listed above; otherwise it should return false.

import java.util.regex.*;

public class telephone {

    public static boolean validateName (String txt){
        String regx = "^[\\\\p{L} .'-]+$";
        Pattern pattern = Pattern.compile(regx, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(txt);
        return matcher.find();

    }

    public static void main(String args[]) {

        String name = "Ron O’’Henry";

        System.out.println(validateName(name));

    }
}

But for some reason, it is returning false for any value. What am I doing wrong here?

Aminah Nuraini
Kemat Rochi
  • 2
    What happens with `John Von Neumann`? How about `Eddie Van Halen`? `Hans Ten Brink`? `Arturo Dell'Antonio`? I spent seven years working for a publishing company and hundreds of hours working on this problem. Answer: There is no algorithm that can do this 100%, and you need to build and maintain a list of surname prefixes. It's a ***hard*** problem. – Jim Garrison Apr 25 '16 at 03:33
  • The issue with your regex is the four backslashes, when you only need two. Fixing that alone doesn't make your regex match all the test names, though. – billjamesdev Apr 25 '16 at 03:34

3 Answers

4

Use this:

^[\p{L}\s.’\-,]+$

Demo: https://regex101.com/r/dQ8fK8/1

Explanation:

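  1. The biggest problem you have is that ' (the straight apostrophe) and ’ (the typographic apostrophe used in your test names) are different characters. You can only get the second one by copy-pasting it from the text.
  2. Use \- instead of - inside [] when the hyphen sits between two other characters, since it would otherwise be read as a range, as in [a-z]. A hyphen placed last in the class is taken literally, but escaping it is the safer habit.
  3. You can use \s instead of a literal space to match any whitespace character.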
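
To try this pattern from Java, each regex backslash has to be doubled inside the string literal. The following is a minimal sketch rather than a definitive implementation: the class name CurlyApostropheDemo is made up for illustration, and the typographic apostrophe is written as the escape \u2019 (an assumption of this sketch, so the result does not depend on the source file's encoding).

```java
import java.util.regex.Pattern;

class CurlyApostropheDemo {
    // The regex above as a Java string literal: every regex backslash is doubled.
    // The typographic apostrophe (U+2019) is written as a Unicode escape.
    static final Pattern NAME = Pattern.compile("^[\\p{L}\\s.\u2019\\-,]+$");

    static boolean validateName(String txt) {
        // matches() requires the whole input to fit the pattern
        return NAME.matcher(txt).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Bruce Schneier",
            "Schneier, Bruce",
            "Schneier, Bruce Wayne",
            "O\u2019Malley, John F.",   // typographic apostrophe, as in the question
            "John O\u2019Malley-Smith",
            "Cher",
            "O'Malley, John F."         // straight apostrophe: not in the class
        };
        for (String s : samples) {
            System.out.println(s + " -> " + validateName(s));
        }
    }
}
```

The first six samples are accepted and the last one is rejected, because the character class contains only the typographic apostrophe. If straight apostrophes should be accepted as well, ' would have to be added to the class.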
Aminah Nuraini
3

You can do:

^[^\s]+,?(\s[^\s]+)*$
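
Note that this pattern only checks the shape of the input (runs of non-whitespace separated by single whitespace characters), not which characters appear. A small sketch of how it behaves from Java, with a made-up class name TokenShapeDemo:

```java
import java.util.regex.Pattern;

class TokenShapeDemo {
    // Same regex as above; backslashes are doubled for the Java string literal.
    static final Pattern SHAPE = Pattern.compile("^[^\\s]+,?(\\s[^\\s]+)*$");

    static boolean validateName(String txt) {
        return SHAPE.matcher(txt).matches();
    }

    public static void main(String[] args) {
        System.out.println(validateName("Schneier, Bruce Wayne")); // true
        System.out.println(validateName("Cher"));                  // true
        System.out.println(validateName("Bruce  Schneier"));       // false: two spaces in a row
        System.out.println(validateName("123 !!!"));               // true: any non-whitespace is accepted
    }
}
```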
heemayl
1

You put too many backslashes in the regex: "^[\\\\p{L} .'-]+$"
After Java literal interpretation, that is: ^[\\p{L} .'-]+$
Which means match any combination of the following characters:

\  p  {  L  }  space  .  '  -

If you change to: "^[\\p{L} .'-]+$"
Regex will see: ^[\p{L} .'-]+$
Which means match any combination of the following characters:

letters  space  .  '  -
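
The difference between the two literals can be checked directly. This is a minimal sketch; the class name BackslashDemo and the helper matches are invented for illustration and are not part of the question's code.

```java
import java.util.regex.Pattern;

class BackslashDemo {
    // Four backslashes: the regex engine sees ^[\\p{L} .'-]+$
    // which is a class of the literal characters \ p { L } space . ' -
    static final Pattern TOO_MANY = Pattern.compile("^[\\\\p{L} .'-]+$");

    // Two backslashes: the regex engine sees ^[\p{L} .'-]+$
    // where \p{L} means "any Unicode letter"
    static final Pattern CORRECT = Pattern.compile("^[\\p{L} .'-]+$");

    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(TOO_MANY, "Bruce Schneier"));  // false
        System.out.println(matches(TOO_MANY, "{pL}"));            // true: only those literal characters
        System.out.println(matches(CORRECT, "Bruce Schneier"));   // true
        System.out.println(matches(CORRECT, "Schneier, Bruce"));  // false: comma is not in the class
        System.out.println(matches(CORRECT, "O\u2019Malley"));    // false: typographic apostrophe is not in the class
    }
}
```

As the last two lines show, the corrected literal still rejects several of the formats from the question, since the class contains neither a comma nor the typographic apostrophe.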

BUT: Don't validate names.

See "What are all of the allowable characters for people's names?", which leads to "Personal names around the world".

In short: You can't, so don't.

Community
Andreas
  • Thank you for sharing your thoughts. That little snippet was only part of a classroom stuff. But I'll definitely keep your suggestions in mind when I'm working with real world projects. – Kemat Rochi Apr 25 '16 at 03:53