-3

I have a URL which can be

"http://example.com/bar1/checkstatus" or "http://example.com/bar2/checkstatus"

What's the most effective way to match this URL using the .matches() function in Java, where example.com remains constant and is followed by bar1 or bar2? The rest of the URL can vary.

Wiktor Stribiżew
  • What are your *exact and comprehensive* requirements? What have you tried? How isn't it working? – Hovercraft Full Of Eels Oct 07 '19 at 00:14
  • After reading this several times, I’m still not sure whether you’re trying to locate a URL embedded in text, or trying to parse a URL into its parts. – VGR Oct 07 '19 at 11:43

2 Answers

3

The best way is not to do it that way.

Instead, use the URL or URI class to parse the URL, then extract the "path" component and analyze it further. (You could use a regex to search the path ... after the URL parser has dealt with the escaping.)

Why is using a regex search on the text of a URL a bad idea?

Because:

  • some parts of a URL are case sensitive and others are not
  • some parts of a URL may be encoded
  • some parts of a URL may be order sensitive

A regex that takes account of these things is typically complicated and difficult to read. And if you ignore them, your matching is liable to malfunction when presented with various edge-case URLs.
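As a minimal sketch of the parse-first approach, assuming the check is "host is example.com and the path starts with /bar1/ or /bar2/": the class name UriPathCheck and the helper isBarUrl are hypothetical names chosen for illustration, and the regex is applied only to the path that java.net.URI has already decoded.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriPathCheck {

    // Hypothetical helper: true when the host is example.com and the
    // decoded path starts with /bar1/ or /bar2/.
    public static boolean isBarUrl(String url) {
        final URI uri;
        try {
            uri = new URI(url);
        } catch (URISyntaxException e) {
            return false; // not a syntactically valid URI
        }
        String host = uri.getHost(); // null for relative or opaque URIs
        String path = uri.getPath(); // percent-escapes are decoded here
        if (host == null || path == null) {
            return false;
        }
        // Host names are case-insensitive; the path is compared case-sensitively.
        return host.equalsIgnoreCase("example.com")
                && path.matches("/bar[12]/.*");
    }

    public static void main(String[] args) {
        System.out.println(isBarUrl("http://example.com/bar1/checkstatus"));   // true
        System.out.println(isBarUrl("http://EXAMPLE.com/bar2/check%73tatus")); // true
        System.out.println(isBarUrl("http://example.com/bar3/checkstatus"));   // false
    }
}
```

The second call shows the case and encoding issues listed above being handled by the parser rather than by the regex.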

Stephen C
  • What you say definitely makes sense. However, as per the example above, `example.com/bar1` and `example.com/bar2` are fixed URLs whose values are known. Hence I was thinking of the regex approach, something along the lines of ` \/example.com\/(bar1|bar2)\/ ` (I am not sure if that is the correct syntax for Java, but when I use it with the .matches function, it returns false) – Crazy Cat Oct 07 '19 at 00:29
  • The `matches` method matches against the entire string. – Stephen C Oct 07 '19 at 01:07
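To illustrate the point made in the last comment, here is a rough sketch of why a fragment pattern returns false with matches() but is found by Matcher.find(). The class and method names are made up for the example, and the fragment is an adaptation of the one in the comment above (the dot is escaped, and the slashes are left unescaped since Java does not need them escaped).

```java
import java.util.regex.Pattern;

class MatchesVsFind {

    // Adapted fragment: matches only the "/example.com/bar1/" part of a URL.
    static final String FRAGMENT = "/example\\.com/(bar1|bar2)/";

    // String.matches() succeeds only if the pattern covers the whole input.
    public static boolean wholeString(String url) {
        return url.matches(FRAGMENT);
    }

    // Matcher.find() succeeds if the pattern occurs anywhere in the input.
    public static boolean anywhere(String url) {
        return Pattern.compile(FRAGMENT).matcher(url).find();
    }

    // Padding with .* lets matches() accept text before and after the fragment.
    public static boolean wholeStringPadded(String url) {
        return url.matches(".*" + FRAGMENT + ".*");
    }

    public static void main(String[] args) {
        String url = "http://example.com/bar1/checkstatus";
        System.out.println(wholeString(url));       // false
        System.out.println(anywhere(url));          // true
        System.out.println(wholeStringPadded(url)); // true
    }
}
```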
0

I'm not sure about the best way, but I'm guessing that you wish to match the URL and capture the checkstatus segment, for which we could start with a simple expression:

(?i)^https?://(?:w{3}\.)?example\.com/bar[12]/([^/]*)/?$ 

assuming that there may be an optional www. ((?:w{3}\.)?), either http or https (s?), and an optional trailing slash (/?). If not, we can simply remove those from the expression:

(?i)^http://example\.com/bar[12]/([^/]*)$ 

Test

import java.util.regex.Matcher;
import java.util.regex.Pattern;


public class RegularExpression{

    public static void main(String[] args){

        final String regex = "(?i)^https?://(?:w{3}\\.)?example\\.com/bar[12]/([^/]*)/?$";
        final String string = "http://example.com/bar1/checkstatus\n"
             + "http://example.com/bar2/checkstatus\n"
             + "https://www.example.com/bar1/checkstatus\n"
             + "https://www.example.com/bar2/checkstatus\n"
             + "http://example.com/bar1/checkstatus/\n"
             + "http://example.com/bar2/checkstatus/\n"
             + "https://www.example.com/bar1/checkstatus/\n"
             + "https://www.example.com/bar2/checkstatus/";

        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);

        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }


    }
}

Output

Full match: http://example.com/bar1/checkstatus
Group 1: checkstatus
Full match: http://example.com/bar2/checkstatus
Group 1: checkstatus
Full match: https://www.example.com/bar1/checkstatus
Group 1: checkstatus
Full match: https://www.example.com/bar2/checkstatus
Group 1: checkstatus
Full match: http://example.com/bar1/checkstatus/
Group 1: checkstatus
Full match: http://example.com/bar2/checkstatus/
Group 1: checkstatus
Full match: https://www.example.com/bar1/checkstatus/
Group 1: checkstatus
Full match: https://www.example.com/bar2/checkstatus/
Group 1: checkstatus

If you wish to simplify, modify, or explore the expression, regex101.com explains it in its top-right panel and also shows how it matches against sample inputs.


RegEx Circuit

jex.im can be used to visualize regular expressions.

Emma