
I am trying to detect URLs inside another string. I found a regex in an answer to another SO question, but it does not work for one of our use cases.

Detect and extract url from a string?

        URL_REGEX = "(?:^|[\\W])((ht|f)tp(s?):\\/\\/|www\\.)"
            + "(([\\w\\-]+\\.){1,}?([\\w\\-.~]+\\/?)*"
            + "[\\p{Alnum}.,%_=?&#\\-+()\\[\\]\\*$~@!:/{};']*)";
        Pattern p = Pattern.compile(URL_REGEX, Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);

        String str = "hello example.com";    // DOES NOT WORK 
        //str = "$ANY_WORD example.com $ANY_WORD_1";    // DOES NOT WORK 
        str = "hello http://example.com";    // WORKS

Can you please modify the above regex so that it also works for str = "hello example.com"?

The input string can be a combination of many words and URLs.

GJain
  • only for hello or any other word before a URL? – perreal May 22 '18 at 04:01
  • @perreal any word before a URL or after. String can be a combination of many words and urls – GJain May 22 '18 at 04:03
  • The accepted answer in the duplicate question has two regex patterns for matching URL with or without `http(s)://` and `www.`. If you'd like, you can [combine the two together](https://regex101.com/r/zusawb/1) or [like this](https://regex101.com/r/zusawb/2) _with no capturing groups_. – 41686d6564 stands w. Palestine May 22 '18 at 04:22

1 Answer


I do not understand why you start your regex with a non-capturing group if you are just searching within a normal string, as you indicated. Anyway:

It should work if you remove the prefix part, `(?:^|[\\W])((ht|f)tp(s?):\\/\\/|www\\.)`

Note that the match will then not include the `hello` unless you add `[a-z-]{5}` in front.

I mostly build and test my regular expressions with https://regexr.com/
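To cover URLs both with and without a scheme in one pass, one option is to make the `http(s)://`, `ftp://` or `www.` prefix optional instead of removing it. The following is only a minimal sketch, not a full URL validator: the class name `UrlFinder`, the method `findUrls`, and the simplified pattern are assumptions made for illustration and are not taken from the linked question. It will also accept anything shaped like `name.tld` (for example `file.txt`), and it keeps trailing punctuation that follows a path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlFinder {
    // Hypothetical, simplified pattern: the scheme / "www." prefix is optional.
    private static final Pattern URL = Pattern.compile(
          "(?<![\\w@.-])"                    // do not start inside a word or an e-mail address
        + "(?:(?:https?|ftp)://|www\\.)?"    // optional scheme or "www." prefix
        + "(?:[a-z0-9-]+\\.)+[a-z]{2,}"      // host: dot-separated labels ending in a 2+ letter TLD
        + "(?::\\d+)?"                       // optional port
        + "(?:/\\S*)?",                      // optional path, query and fragment
        Pattern.CASE_INSENSITIVE);

    // Returns every URL-like substring found in the input, in order of appearance.
    static List<String> findUrls(String input) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL.matcher(input);
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        System.out.println(findUrls("hello example.com"));                       // [example.com]
        System.out.println(findUrls("hello http://example.com"));                // [http://example.com]
        System.out.println(findUrls("$ANY_WORD example.com $ANY_WORD_1"));       // [example.com]
    }
}
```

Because the prefix is optional rather than removed, `http://` stays part of the match when it is present, which is the case the asker raises in the comments.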

emakundi
  • It will not capture http:// in URL. I need it to work for substrings with both http and without http. – GJain May 22 '18 at 04:18
  • `((http|https)\:\/\/)?[a-zA-Z0-9\.\/\?\:@\-_=#]+\.([a-zA-Z0-9\&\.\/\?\:@\-_=#]{3})` but that will only take links that end in a generic three-character TLD – emakundi May 22 '18 at 05:44
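A quick way to check the pattern from the last comment against the strings in the question is a small Java harness. This is just a sketch: the class name `CommentPatternCheck` and the helper `firstMatch` are hypothetical, while the regex itself is copied from the comment unchanged (only escaped for a Java string literal).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentPatternCheck {
    // Regex from the comment above, with backslashes doubled for Java.
    private static final Pattern P = Pattern.compile(
        "((http|https)\\:\\/\\/)?[a-zA-Z0-9\\.\\/\\?\\:@\\-_=#]+"
        + "\\.([a-zA-Z0-9\\&\\.\\/\\?\\:@\\-_=#]{3})");

    // Returns the first match in the input, or null if there is none.
    static String firstMatch(String input) {
        Matcher m = P.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("hello example.com"));        // example.com
        System.out.println(firstMatch("hello http://example.com")); // http://example.com
        System.out.println(firstMatch("hello example.io"));         // null
    }
}
```

The last call illustrates the limitation mentioned in the comment: the trailing `{3}` requires exactly three characters after the final dot it uses, so a two-letter TLD such as `.io` is not matched.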