
INPUT

The input can take any of the forms shown below. The mandatory part is TXT{any comma-separated strings in any format}.

String loginURL = "http://ip:port/path?username=abcd&location={LOCATION}&TXT{UE-IP,UE-Username,UE-Password}&password={PASS}";
String loginURL1 = "http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}&TXT{UE-IP,UE-Username,UE-Password}";
String loginURL2 = "http://ip:port/path?TXT{UE-IP,UE-Username,UE-Password}&username=abcd&location={LOCATION}&password={PASS}";
String loginURL3 = "http://ip:port/path?TXT{UE-IP,UE-Username,UE-Password}";
String loginURL4 = "http://ip:port/path?username=abcd&password={PASS}";

Required Output

1. OutputURL corresponding to loginURL.

String outputURL = "http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}";
String outputURL1 = "http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}";
String outputURL2 = "http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}";
String outputURL3 = "http://ip:port/path?";
String outputURL4 = "http://ip:port/path?username=abcd&password={PASS}";

2. Deleted pattern(if any)

String deletedPattern = "TXT{UE-IP,UE-Username,UE-Password}";

My Attempts

String loginURLPattern = TXT+"\\{([\\w-,]*)\\}&*";

System.out.println("1. ");
getListOfTemplates(loginURL, loginURLPattern);
System.out.println();

System.out.println("2. ");
getListOfTemplates(loginURL1, loginURLPattern);
System.out.println();

private static void getListOfTemplates(String inputSequence,String pattern){
    System.out.println("Input URL : " + inputSequence);
    Matcher templateMatcher =  Pattern.compile(pattern).matcher(inputSequence);
    if (templateMatcher.find() && templateMatcher.group(1).length() > 0) {
        System.out.println(templateMatcher.group(1));
        System.out.println("OutputURL : " + templateMatcher.replaceAll(""));
    }
}

OUTPUT obtained

1. 
Input URL : http://ip:port/path?username=abcd&location={LOCATION}&TXT{UE-IP,UE-Username,UE-Password}&password={PASS}
UE-IP,UE-Username,UE-Password}&password={PASS
OutputURL : http://ip:port/path?username=abcd&location={LOCATION}&

2. 
Input URL : http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}&TXT{UE-IP,UE-Username,UE-Password}
UE-IP,UE-Username,UE-Password
OutputURL : http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}&

DRAWBACK OF ABOVE PATTERN

If I add any string containing a character like #, % or @ between the braces of TXT{}, my code breaks.

How can I achieve this using the java.util.regex library, so that the user can put any comma-separated strings inside TXT{Any Comma Separated Strings}?

Prateek

1 Answer


I would recommend using Matcher.appendReplacement:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(final String[] args) throws Exception {
    final String[] loginURLs = {
        "http://ip:port/path?username=abcd&location={LOCATION}&TXT{UE-IP,UE-Username,UE-Password}&password={PASS}",
        "http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}&TXT{UE-IP,UE-Username,UE-Password}",
        "http://ip:port/path?TXT{UE-IP,UE-Username,UE-Password}&username=abcd&location={LOCATION}&password={PASS}",
        "http://ip:port/path?TXT{UE-IP,UE-Username,UE-Password}",
        "http://ip:port/path?username=abcd&password={PASS}"};
    // Group 1: optional leading '?'; group 2: the TXT{...} block; group 3: optional trailing '&'.
    final Pattern patt = Pattern.compile("(\\?)?&?(TXT\\{[^}]++})(&)?");
    for (final String loginURL : loginURLs) {
        System.out.printf("%1$-10s %2$s%n", "Processing", loginURL);
        final StringBuffer sb = new StringBuffer();
        final Matcher matcher = patt.matcher(loginURL);
        while (matcher.find()) {
            final String found = matcher.group(2);
            System.out.printf("%1$-10s %2$s%n", "Found", found);
            if (matcher.group(1) != null && matcher.group(3) != null) {
                // "?TXT{...}&": keep the '?' so the remaining parameters still follow it.
                matcher.appendReplacement(sb, "$1");
            } else {
                // Otherwise keep the trailing '&' if there is one, or nothing at all.
                matcher.appendReplacement(sb, "$3");
            }
        }
        // Copy whatever follows the last match.
        matcher.appendTail(sb);
        System.out.printf("%1$-10s %2$s%n%n", "Processed", sb.toString());
    }
}

Output:

Processing http://ip:port/path?username=abcd&location={LOCATION}&TXT{UE-IP,UE-Username,UE-Password}&password={PASS}
Found      TXT{UE-IP,UE-Username,UE-Password}
Processed  http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}

Processing http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}&TXT{UE-IP,UE-Username,UE-Password}
Found      TXT{UE-IP,UE-Username,UE-Password}
Processed  http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}

Processing http://ip:port/path?TXT{UE-IP,UE-Username,UE-Password}&username=abcd&location={LOCATION}&password={PASS}
Found      TXT{UE-IP,UE-Username,UE-Password}
Processed  http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}

Processing http://ip:port/path?TXT{UE-IP,UE-Username,UE-Password}
Found      TXT{UE-IP,UE-Username,UE-Password}
Processed  http://ip:port/path

Processing http://ip:port/path?username=abcd&password={PASS}
Processed  http://ip:port/path?username=abcd&password={PASS}

As you rightly point out, there are 3 possible cases:

  1. "?{TEXT}&" -> "?"
  2. "&{TEXT}&" -> "&"
  3. "?{TEXT}" -> ""

So what we need to do is test for those cases in the regex. Here is the pattern:

(\\?)?&?(TXT\\{[^}]++})(&)?

Explanation:

  • (\\?)? optionally matches and captures a ?
  • &? optionally matches an & without capturing it
  • (TXT\\{[^}]++}) matches and captures TXT, followed by {, followed by one or more characters that are not } (possessively), followed by } (a closing brace doesn't need to be escaped)
  • (&)? optionally matches and captures an &

We have 3 groups:

  1. potentially a ?
  2. the required text
  3. potentially an &
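
As a quick check of the groups listed above, the sketch below reports which groups take part in the match for each placement of the block. The class name GroupInspector, the method describe and the shortened sample URLs are made up for illustration; only the pattern comes from this answer. The second sample puts #, % and @ inside the braces, which [^}] accepts.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupInspector {
    // Same pattern as in the answer.
    static final Pattern TXT_BLOCK = Pattern.compile("(\\?)?&?(TXT\\{[^}]++})(&)?");

    // Returns "group1|group2|group3" for the first match, or null if nothing matches.
    // A group that did not take part in the match shows up as the text "null".
    static String describe(String url) {
        Matcher m = TXT_BLOCK.matcher(url);
        if (!m.find()) {
            return null;
        }
        return m.group(1) + "|" + m.group(2) + "|" + m.group(3);
    }

    public static void main(String[] args) {
        System.out.println(describe("http://ip:port/path?TXT{a,b}&x=1"));              // ?|TXT{a,b}|&
        System.out.println(describe("http://ip:port/path?x=1&TXT{a#b,c%d,e@f}&y=2"));  // null|TXT{a#b,c%d,e@f}|&
        System.out.println(describe("http://ip:port/path?x=1&TXT{a,b}"));              // null|TXT{a,b}|null
        System.out.println(describe("http://ip:port/path?TXT{a,b}"));                  // ?|TXT{a,b}|null
    }
}
```

Group 1 is only non-null when the block directly follows the ?, which is what the replacement logic relies on.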

Now, when we find a match, we need to replace it with the text appropriate to whichever of cases 1 to 3 applies:

if (matcher.group(1) != null && matcher.group(3) != null) {
    matcher.appendReplacement(sb, "$1");                
} else {
    matcher.appendReplacement(sb, "$3");
}

If groups 1 and 3 are both present:

We must be in case 1; we must replace with "?" which is in group 1 so $1.

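Otherwise we are in case 2 or 3:

In case 2 we need to replace with "&" and in case 3 with "".
In case 2 group 3 holds "&". In case 3 group 3 did not take part in the match, and appendReplacement substitutes nothing for a reference to an unmatched group, so we can replace with $3 in both cases. The same holds when the block is the last of several parameters ("&{TEXT}" -> ""), as in the second input.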
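
Since the question asks for both the cleaned URL and the deleted pattern, the logic above could be wrapped in a helper along these lines. This is only a sketch: the class name TxtStripper, the method strip and the choice to collect removed blocks in a caller-supplied list are assumptions, not part of the original code. It spells the replacement out as a literal instead of using $1 and $3, which should behave the same for these inputs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TxtStripper {
    // Same pattern as in the answer, compiled once.
    private static final Pattern TXT_BLOCK = Pattern.compile("(\\?)?&?(TXT\\{[^}]++})(&)?");

    // Removes every TXT{...} block from url and appends each removed block to 'removed'.
    static String strip(String url, List<String> removed) {
        Matcher m = TXT_BLOCK.matcher(url);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            removed.add(m.group(2));
            boolean leadingQuestionMark = m.group(1) != null;
            boolean trailingAmpersand = m.group(3) != null;
            String replacement;
            if (leadingQuestionMark && trailingAmpersand) {
                replacement = "?";   // "?TXT{...}&" -> "?"
            } else if (trailingAmpersand) {
                replacement = "&";   // "&TXT{...}&" -> "&"
            } else {
                replacement = "";    // block at the very end -> nothing
            }
            // quoteReplacement keeps the literal free of '$' and '\' interpretation.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> removed = new ArrayList<String>();
        String out = strip("http://ip:port/path?TXT{UE-IP,UE-Username,UE-Password}&username=abcd&password={PASS}", removed);
        System.out.println(out);      // http://ip:port/path?username=abcd&password={PASS}
        System.out.println(removed);  // [TXT{UE-IP,UE-Username,UE-Password}]
    }
}
```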

Here I only capture the TXT{...} part using a match group. This means that although the leading ? or & is replaced, it is not in the String found. If you only want the bit between {} then just move the parentheses.
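
Moving the parentheses might look like the following sketch, which captures only the text between the braces and splits it on commas. The class name TxtContents, the method items and the sample URL are hypothetical. Because [^}]++ needs at least one character, an empty TXT{} would not match here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TxtContents {
    // The capturing group now wraps only what sits between '{' and '}'.
    private static final Pattern TXT_CONTENTS = Pattern.compile("TXT\\{([^}]++)}");

    // Returns the comma-separated items of the first TXT{...} block, or an empty list.
    static List<String> items(String url) {
        Matcher m = TXT_CONTENTS.matcher(url);
        if (!m.find()) {
            return Collections.emptyList();
        }
        return Arrays.asList(m.group(1).split(","));
    }

    public static void main(String[] args) {
        System.out.println(items("http://ip:port/path?TXT{UE-IP,UE#Name,UE@Pass}&x=1"));
        // [UE-IP, UE#Name, UE@Pass]
    }
}
```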

Note that I reuse the Pattern; you can also reuse the Matcher if performance is a concern. You should always reuse the Pattern, as compiling it is expensive. Store it in a static final if you can: a Pattern is thread-safe, a Matcher is not. The usual approach is to keep the Pattern in a static final and reuse a Matcher within the scope of a single method.
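
A minimal sketch of that arrangement, assuming a hypothetical class ReusedMatcher with a method countUrlsWithTxt: the Pattern lives in a static final field, while a single Matcher is created per method call and pointed at each new input with reset.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReusedMatcher {
    // Compiled once and safe to share between threads.
    private static final Pattern TXT_BLOCK = Pattern.compile("TXT\\{[^}]++}");

    // Counts how many of the given URLs contain a TXT{...} block.
    static int countUrlsWithTxt(String[] urls) {
        // One Matcher, local to this call, so it is never shared between threads.
        Matcher m = TXT_BLOCK.matcher("");
        int count = 0;
        for (String url : urls) {
            m.reset(url); // reuse the same Matcher on a new input
            if (m.find()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://ip:port/path?TXT{a,b}",
            "http://ip:port/path?username=abcd&password={PASS}"};
        System.out.println(countUrlsWithTxt(urls)); // 1
    }
}
```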

Also, the use of Matcher.appendReplacement is much more efficient than your current approach as it only needs to process the input once. Your approach parses the string twice.

Boris the Spider
  • I also required the replaced String. – Prateek Mar 21 '14 at 09:10
  • It has not solved my problem. For the third input, the actual output is *http://ip:port/path&username=abcd&location={LOCATION}&password={PASS}* and the desired output is *http://ip:port/path?username=abcd&location={LOCATION}&password={PASS}* – Prateek Mar 21 '14 at 09:51
  • Please take your time, I will wait for your solution. – Prateek Mar 21 '14 at 10:29
  • @Prateek [this](http://stackoverflow.com/questions/22557708/regex-possesive-quantifier/22558236#22558236) might be of interest - that's why it took quite so long to find the answer. – Boris the Spider Mar 21 '14 at 12:48