-4

How do you modify the following regex:

String URLpattern   = "((https?|ftp|gopher|telnet|file|Unsure|http):((//)|(\\\\))+[\\w\\d:#@%/;$()~_?\\+-=\\\\\\.&]*)"

so that it takes into account ALL of the following URL forms?

http://www.website.com
https://www.website.com
www.website.com
website.com
http://website.com
https://website.com

EDIT: There are a bunch of already proposed solutions to the problem. I can list some of them:

However, although I tried all of them, none of them explains how and why the expression works. So whenever something fails (and for me, failures occurred with URLs of the form website.com and http://website.com), it becomes difficult for newbies (like me ;) ) to modify the expression or understand what went wrong. A well-explained solution is always better than a solution made by others that cannot be reproduced the next time =)

Community
Eleanore
  • Why did I get so many downvotes? I mean: I looked for other questions related to this topic, but none of them offers a working solution... If you think this is not a good question, you should vote to close it :) – Eleanore May 02 '14 at 13:14
  • We want to see effort. If you've done research, show it and explain why it doesn't help you. – Sotirios Delimanolis May 02 '14 at 13:29

2 Answers

3

Given your list of websites, the following regex will do the trick:

(https?://)?(www\.)?\w+\.com

Demo
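Since the question asks how and why the expression works, here is a minimal Java sketch that annotates each part of the regex above and checks it against the six URL forms from the question. The class name UrlRegexDemo and the method isUrl are made up for illustration; only the pattern itself comes from this answer.

```java
import java.util.regex.Pattern;

public class UrlRegexDemo {
    // (https?://)?  optional scheme: "http://" or "https://" (the "s" is optional)
    // (www\.)?      optional literal "www." (the dot is escaped so it is not "any character")
    // \w+           one or more word characters: letters, digits or underscore
    // \.com         a literal ".com"
    // Backslashes are doubled only because this is a Java String literal.
    static final Pattern URL = Pattern.compile("(https?://)?(www\\.)?\\w+\\.com");

    // Hypothetical helper: matches() requires the WHOLE input to match the pattern.
    static boolean isUrl(String candidate) {
        return URL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.website.com", "https://www.website.com", "www.website.com",
            "website.com", "http://website.com", "https://website.com"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isUrl(sample));
        }
    }
}
```

Note that this pattern is deliberately narrow: it only accepts names ending in .com, and because \w does not include the hyphen, a name such as my-site.com would be rejected. Widening it means changing the \w+ and \.com parts.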

sshashank124
  • I know the feeling you get when upvotes don't count after reaching 20 upvotes lol – Amit Joki May 02 '14 at 13:04
  • @DavidWallace, Thank you, answer corrected. The /g might have tagged along by mistake when I was copying from regex101.com. – sshashank124 May 02 '14 at 13:13
  • And also: I guess Java requires two backslashes – Eleanore May 02 '14 at 13:15
  • 1
    @Eleanore Well, not exactly. Having a regexp in a String literal means you have to double the backslashes, because of the rules about how String literals work. If your regexp came from somewhere else (like being read in from a file), you wouldn't have to double the backslashes. – Dawood ibn Kareem May 02 '14 at 13:18
  • @Eleanore, If my answer was helpful, would you mind accepting it? Thank you. – sshashank124 May 03 '14 at 04:47
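The point made in the comments about doubled backslashes can be illustrated with a small Java sketch. The class BackslashDemo and the method fromElsewhere are hypothetical; fromElsewhere only stands in for a pattern that arrives at runtime (for example from a file) and assembles the text character by character, so no regex backslash is written as a Java escape in that method.

```java
import java.util.regex.Pattern;

public class BackslashDemo {
    // In Java source, "\\" inside a String literal is ONE backslash character.
    // The regex engine therefore receives: (www\.)?\w+\.com
    static final String FROM_LITERAL = "(www\\.)?\\w+\\.com";

    // Stand-in for a pattern obtained at runtime: the backslash is supplied
    // as a character value, so nothing needs to be doubled here.
    static String fromElsewhere() {
        char backslash = 92; // character code of the backslash
        return "(www" + backslash + ".)?" + backslash + "w+" + backslash + ".com";
    }

    public static void main(String[] args) {
        System.out.println(FROM_LITERAL);                          // prints (www\.)?\w+\.com
        System.out.println(FROM_LITERAL.equals(fromElsewhere()));  // prints true
        System.out.println(Pattern.compile(fromElsewhere()).matcher("www.website.com").matches());
    }
}
```

Both strings hold the same 16 characters, which suggests the doubling is a property of Java String literals rather than of the regex itself.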
0

Try this one; it works fine. Tested with all your URLs!

import java.util.regex.Pattern;

// Optional "@" or href=/HREF= prefix, optional scheme, a host with at least one dot, optional path.
private static final String URL_PATTERN = "(@)?(href=')?(HREF=')?(HREF=\")?(href=\")?(http://)?(https://)?" +
        "[a-zA-Z_0-9\\-]+(\\.\\w[a-zA-Z_0-9\\-]+)+(/[#&\\n\\-=?\\+\\%/\\.\\w]+)?";

// Compiled once and reused for every call.
private static final Pattern urlPattern = Pattern.compile(URL_PATTERN);

public boolean isURL(String url)
{
    // Strip spaces, then require the whole string to match the pattern.
    return urlPattern.matcher(url.trim().replace(" ", "")).matches();
}
Bettimms