0

I have a String of the form user@domain:port

I want to extract the user, domain and port from this String.

So I created this regex:

public static final String MATCH_USER_DOMAIN_PORT = "^([0-9,a-zA-Z-.*_]+)@([a-z0-9]+[\\.-][a-z0-9]+\\.[a-z]{2,}+):(6553[0-5]|655[0-2]\\d|65[0-4]\\d{2}|6[0-4]\\d{3}|[1-5]\\d{4}|[1-9]\\d{0,3})$";

and this is my unit test method so far:

public void test____matchesUserDomainWithPort(){

     String identityText = "maxim@domain.com:5555";
        String user = "";
        String domain = "";
        String port = "";

        if(identityText.matches(MATCH_USER_DOMAIN_PORT))
        {                                
            Pattern p = Pattern.compile(MATCH_USER_DOMAIN_PORT);
            Matcher m = p.matcher(identityText);

            user = m.group(1);
            domain= m.group(2);
            port= m.group(3);
        }

    assertEquals("maxim", user);
    assertEquals("domain.com", domain);
    assertEquals("5555", port);

}

I get this error:

 java.lang.IllegalStateException: No successful match so far
 at java.util.regex.Matcher.ensureMatch(Matcher.java:607)
 ....

on the line: user = m.group(1);

I opened http://gskinner.com/RegExr/?2v5r0

and everything looks fine there:

Output:

RegExp: /^([0-9,a-zA-Z-.*_]+@[a-z0-9]+([\.-][a-z0-9]+)*)+\.[a-z]{2,}+:(6553[0-5]|655[0-2]\d|65[0-4]\d{2}|6[0-4]\d{3}|[1-5]\d{4}|[1-9]\d{0,3})$/
pattern: ^([0-9,a-zA-Z-.*_]+@[a-z0-9]+([\.-][a-z0-9]+)*)+\.[a-z]{2,}+:(6553[0-5]|655[0-2]\d|65[0-4]\d{2}|6[0-4]\d{3}|[1-5]\d{4}|[1-9]\d{0,3})$
flags: 
3 capturing groups: 
   group 1: ([0-9,a-zA-Z-.*_]+@[a-z0-9]+([\.-][a-z0-9]+)*)
   group 2: ([\.-][a-z0-9]+)
   group 3: (6553[0-5]|655[0-2]\d|65[0-4]\d{2}|6[0-4]\d{3}|[1-5]\d{4}|[1-9]\d{0,3})

Am I missing something?

In C I would just write: sscanf(identityText,"%[^@]@%[^:]:%511s",user,domain,port);

Of course I could split this text on @ and : to get the 3 values, but I'm interested in how to do it in a more elegant way :)
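For comparison, the split-based fallback mentioned above might look like the following. This is only a sketch: the class name `IdentitySplitSketch` and the method `splitIdentity` are hypothetical, and it assumes the user part contains no `@` and the port is whatever follows the last `:`.

```java
final class IdentitySplitSketch {

    /** Returns {user, domain, port}; throws if the separators are missing or misplaced. */
    static String[] splitIdentity(String identityText) {
        int at = identityText.indexOf('@');         // first '@' ends the user part
        int colon = identityText.lastIndexOf(':');  // last ':' starts the port part
        // Require a non-empty user, domain and port, in that order.
        if (at <= 0 || colon <= at + 1 || colon == identityText.length() - 1) {
            throw new IllegalArgumentException("Expected user@domain:port, got: " + identityText);
        }
        return new String[] {
            identityText.substring(0, at),
            identityText.substring(at + 1, colon),
            identityText.substring(colon + 1)
        };
    }

    public static void main(String[] args) {
        String[] parts = splitIdentity("maxim@domain.com:5555");
        System.out.println(parts[0] + " | " + parts[1] + " | " + parts[2]);
    }
}
```

Unlike the regex, this does no validation of the characters in each part; it only locates the two separators.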

Please help.

Maxim Shoustin
  • Your regex looks wrong. The first inner group will eat up the ".com", but another "." is required after that. Try getting rid of the "[a-z0-9]+" between the "@" and that first inner group. – Rob I Apr 16 '13 at 12:32
  • It works fine: the line `if(identityText.matches(MATCH_USER_DOMAIN_PORT))` returns `true`, but I can't get the 1st group – Maxim Shoustin Apr 16 '13 at 12:36

2 Answers

1

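A Matcher has to attempt a match before you can read its groups, so call m.find() first:

if(identityText.matches(MATCH_USER_DOMAIN_PORT)){
    Pattern p = Pattern.compile(MATCH_USER_DOMAIN_PORT);
    Matcher m = p.matcher(identityText);
    // group() is only valid after a successful find() or matches()
    while(m.find()){
        user = m.group(1);
        domain = m.group(2);
        port = m.group(3);
    }
}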

thanks
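The exception in the question comes from calling `group()` on a Matcher that has not attempted a match yet. As a variation on the code above, here is a minimal sketch that compiles the Pattern once and calls `m.matches()` on the Matcher itself, so the input is only matched once. The class name `IdentityMatchSketch` and the simplified pattern are assumptions for illustration; it is not the pattern from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IdentityMatchSketch {

    // Simplified, hypothetical pattern: anything up to '@', anything up to ':', then digits.
    private static final Pattern IDENTITY = Pattern.compile("^([^@]+)@([^:]+):(\\d+)$");

    /** Returns {user, domain, port}, or null when the text does not match. */
    static String[] parse(String identityText) {
        Matcher m = IDENTITY.matcher(identityText);
        // A fresh Matcher holds no match state; group() throws IllegalStateException
        // until matches(), find() or lookingAt() has succeeded.
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parse("maxim@domain.com:5555");
        System.out.println(parts[0] + " | " + parts[1] + " | " + parts[2]);
    }
}
```

Calling `String.matches()` and then creating a second Matcher works too, but it compiles the pattern and scans the input twice.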

Amit Sharma
  • I'm glad it works, and the `m.find()` is necessary, but I can't believe your second group is matching "domain.com". Maybe I'm just confused. – Rob I Apr 16 '13 at 12:50
0

Yes, I think your regex is wrong.

public static final String MATCH_USER_DOMAIN_PORT = "^([0-9,a-zA-Z-.*_]+@[a-z0-9]+([\\.-][a-z0-9]+)*)+\\.[a-z]{2,}+:(6553[0-5]|655[0-2]\\d|65[0-4]\\d{2}|6[0-4]\\d{3}|[1-5]\\d{4}|[1-9]\\d{0,3})$";

To break it down:

  • ^(
  • [0-9,a-zA-Z-.*_]+
    • one or more of these characters, will match "maxim"
  • @
    • will match "@"
  • [a-z0-9]+
    • one or more of these characters, will match "domain"
  • ([\\.-][a-z0-9]+)*
    • will match ".com" (or theoretically ".somethingelse.com", nice)
  • )+
    • will make group #1 "maxim@domain.com", I believe, but what's with the "+"?
  • \\.
    • nothing left in the input string here, unless the previous group backtracks and gives ".com" back
  • [a-z]{2,}+
    • is this for a country code like .eu? Again, what's with the "+"?
  • :
  • (6553[0-5]|655[0-2]\\d|65[0-4]\\d{2}|6[0-4]\\d{3}|[1-5]\\d{4}|[1-9]\\d{0,3})
    • seems overly complicated; it is probably better not to do the numeric range validation in the regex
  • $
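One way to follow the last suggestion is to let the regex check only the shape of the port (1 to 5 digits) and do the range check in plain Java. The sketch below assumes that approach; the class name `PortRangeSketch`, the method `portOf` and the loosened pattern are hypothetical, and unlike the original alternation it accepts leading zeros such as "0080".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PortRangeSketch {

    // Hypothetical looser pattern: the regex only checks the overall shape.
    private static final Pattern SHAPE = Pattern.compile("^([^@:]+)@([^@:]+):(\\d{1,5})$");

    /** Returns the port as an int, or -1 if the text is malformed or the port is outside 1..65535. */
    static int portOf(String identityText) {
        Matcher m = SHAPE.matcher(identityText);
        if (!m.matches()) {
            return -1;
        }
        // At most 5 digits were captured, so parseInt cannot overflow.
        int port = Integer.parseInt(m.group(3));
        return (port >= 1 && port <= 65535) ? port : -1;
    }

    public static void main(String[] args) {
        System.out.println(portOf("maxim@domain.com:5555"));
        System.out.println(portOf("maxim@domain.com:70000"));
    }
}
```

The range check becomes a single readable comparison instead of a six-branch alternation.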

Take a look at "Using a regular expression to validate an email address" for some advice on validation of email addresses.

Rob I