0

I have been following the thread "How to split a string in Java" with success.

But in my current use case, the String I am dealing with contains special characters.

I have a String such as https://{domain name}/{type of data}/4583236-{name-of-perpetrators} and I want to extract 4583236 from it.

The Q&A "How to split the string using '^' this special character in java?" is more or less related to the question I mentioned above, but it doesn't help in my use case.

My program throws PatternSyntaxException: Illegal repetition, seemingly at random, on either of the special characters.

Code block:

    String current_url = "https://{domain name}/{type of data}/4583236-{name-of-perpetrators}";
    String[] urlParts = current_url.split("type of data}/");
    String mySuburl = urlParts[1];
    String[] suburl = mySuburl.split("-{name-of-perpetrators");
    String mytext = suburl[0];
    System.out.println(mytext);

Error stack trace:

Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal repetition
{name-of-perpetrators
    at java.util.regex.Pattern.error(Unknown Source)
    at java.util.regex.Pattern.closure(Unknown Source)
    at java.util.regex.Pattern.sequence(Unknown Source)
    at java.util.regex.Pattern.expr(Unknown Source)
    at java.util.regex.Pattern.compile(Unknown Source)
    at java.util.regex.Pattern.<init>(Unknown Source)
    at java.util.regex.Pattern.compile(Unknown Source)
    at java.lang.String.split(Unknown Source)
    at java.lang.String.split(Unknown Source)
    at demo.TextSplit.main(TextSplit.java:18)
Youcef LAIDANI
undetected Selenium

3 Answers

4

Try Pattern.quote to avoid escaping character by character; it does that for you:

String[] suburl = mySuburl.split(Pattern.quote("-{name-of-perpetrators"));
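
For completeness, here is a minimal end-to-end sketch of the same idea applied to both splits from the question. The class name QuoteSplitDemo and the helper extractId are hypothetical names chosen for illustration, and the sketch assumes the URL always contains both markers.

```java
import java.util.regex.Pattern;

public class QuoteSplitDemo {
    // Hypothetical helper: returns the text between the two literal markers.
    // Assumes both markers are present; otherwise urlParts[1] would fail.
    static String extractId(String url) {
        // Pattern.quote turns each marker into a literal pattern,
        // so '{' and '}' are no longer read as regex syntax.
        String[] urlParts = url.split(Pattern.quote("{type of data}/"));
        String[] subParts = urlParts[1].split(Pattern.quote("-{name-of-perpetrators}"));
        return subParts[0];
    }

    public static void main(String[] args) {
        String currentUrl = "https://{domain name}/{type of data}/4583236-{name-of-perpetrators}";
        System.out.println(extractId(currentUrl)); // prints 4583236
    }
}
```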
Youcef LAIDANI
2

The argument to split is a regex, so you need to escape characters that are special in regex, such as {. Braces denote repetition in regex (for example, a{2,3}), hence the error Illegal repetition.

String[] suburl = mySuburl.split("-\\{name-of-perpetrators");

If you want the argument to split treated as a literal string rather than a regex, use Pattern.quote instead of escaping by hand, as @YCF_L suggested.

String[] suburl = mySuburl.split(Pattern.quote("-{name-of-perpetrators"));
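
To see the difference between the three forms, one can check whether each pattern compiles at all. This is only a small sketch; the class name EscapeDemo and the helper compiles are hypothetical names used for illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class EscapeDemo {
    // Hypothetical helper: true if the string is a valid Java regex.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Unescaped '{' starts a repetition quantifier, which is malformed here.
        System.out.println(compiles("-{name-of-perpetrators"));                // false
        // Escaping the brace by hand makes it a literal character.
        System.out.println(compiles("-\\{name-of-perpetrators"));              // true
        // Pattern.quote makes the whole string literal.
        System.out.println(compiles(Pattern.quote("-{name-of-perpetrators"))); // true
    }
}
```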
Bless
1

There is no reason to use something as complex as regular expression patterns for something as simple as finding a literal string contained in another string.

Using indexOf and substring is sufficient:

String text = "https://{domain name}/{type of data}/4583236-{name-of-perpetrators}";
String searchStart = "{type of data}/";
String searchEnd = "-{name-of-perpetrators}";
int start = text.indexOf(searchStart) + searchStart.length();
int end = text.indexOf(searchEnd, start);

String expected = "4583236";
assertEquals(expected, text.substring(start, end));

Obviously, if the input text might not have exactly this format, this approach can fail: indexOf returns -1 when the search string is not found, which leaves end negative or start pointing at the wrong position. If that can happen, you should check for it and handle it appropriately.
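
One possible way to add those checks is sketched below. The class name BetweenDemo, the method between, and the choice of Optional as the return type are assumptions made for illustration, not part of the original answer.

```java
import java.util.Optional;

public class BetweenDemo {
    // Hypothetical helper: returns the text between startMarker and endMarker,
    // or Optional.empty() if either marker is missing.
    static Optional<String> between(String text, String startMarker, String endMarker) {
        int startIdx = text.indexOf(startMarker);
        if (startIdx < 0) {
            return Optional.empty(); // start marker not found
        }
        int from = startIdx + startMarker.length();
        int to = text.indexOf(endMarker, from);
        if (to < 0) {
            return Optional.empty(); // end marker not found after the start marker
        }
        return Optional.of(text.substring(from, to));
    }

    public static void main(String[] args) {
        String text = "https://{domain name}/{type of data}/4583236-{name-of-perpetrators}";
        Optional<String> id = between(text, "{type of data}/", "-{name-of-perpetrators}");
        System.out.println(id.orElse("not found")); // prints 4583236
    }
}
```

The caller does not need to know the expected value in advance; it only supplies the two markers.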

M. Prokhorov
  • Thanks. A great idea, no doubt. But I won't know what exactly the expected value is. Hence my question was how I can extract the number. – undetected Selenium Jan 10 '18 at 16:30
  • I'm not clear on what you're asking. Calling `substring` with the indices does extract the proper number from the initial text, as shown by the test. What else do you want to know? How to convert a `String` to a number? – M. Prokhorov Jan 11 '18 at 06:23