31

What's the best way to check if a String contains a URL in Java/Android? Would the best way be to check if the string contains |.com | .net | .org | .info | .everythingelse|? Or is there a better way to do it?

The URL is entered into an EditText in Android; it could be a pasted URL or a manually entered one where the user doesn't feel like typing http://... I'm working on a URL shortening app.

William L.
  • What kind of URL do you expect? Relative URL is hard to detect. `/` character is one way, but tends to general false positive. – nhahtdh Jun 13 '12 at 01:18
  • Will it always start with a protocol? Can you just try to parse it with `URL`? – Dave Newton Jun 13 '12 at 01:18
  • 3
    Good luck with this once the new [GTLDs](http://en.wikipedia.org/wiki/Generic_top-level_domain#New_top-level_domains) come out ;) – Brendan Long Jun 13 '12 at 01:28
  • 1
    @WilliamL. : You don't really give enough information. When you say "if a String contains a URL", how about..."Hey Dave, I found this great site called blah.com you should visit it"? What I mean is where are your strings coming from? `blah.com` in this case could be a valid URL but are you parsing any generic text or...well, whatever. Your question is pretty vague. As Dave Newton suggests the `URL` class (and the `URI` class) can be used for parsing. – Squonk Jun 13 '12 at 01:29
  • Anything that ends with a .com or .anything it doesn't have to have http:// at the beginning.. @Squonk I updated my answer – William L. Jun 13 '12 at 01:39

11 Answers

41

The best way would be to use a regular expression, something like the one below:

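import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The ^ and $ anchors mean the whole string must be a URL; only lowercase input matches
public static final String URL_REGEX = "^((https?|ftp)://|(www|ftp)\\.)?[a-z0-9-]+(\\.[a-z0-9-]+)+([/?].*)?$";

Pattern p = Pattern.compile(URL_REGEX);
Matcher m = p.matcher("example.com"); // replace with the string to check
if (m.find()) {
    System.out.println("String contains URL");
}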
Chandra
  • SOLVED! Most accurate and acceptable answer! Thank you! – Rafique Mohammed Oct 11 '14 at 11:10
  • 11
    this doesn't work. For text `hehe, check this link: http://www.example.com/` m.find() returns false – lxknvlk Oct 18 '16 at 15:06
  • 1
    Also for any string as [a-z0-9.][a-z0-9] it will return true. So "asdj.asdj" will be positive – Tom Aug 16 '17 at 07:42
  • The answer tries to find whether an entire string is a URL, which is why it doesn't work for the nice counterexample from @lxknvlk . But all that is needed is a minor change to the regular expression. Use `\b` instead of the anchors '^' and '$'. So the pattern string would be `"\\b((https?|ftp)://|(www|ftp)\\.)?[a-z0-9-]+(\\.[a-z0-9-]+)+([/?].*)?\\b"`. And if the goal is to find only proper URLs, not plain website names, then set `URL_REGEX` to `"\\b(https?|ftp)://[a-z0-9-]+(\\.[a-z0-9-]+)+([/?].*)?\\b"`, and this might address the concern raised by @Tom – so2 Jun 18 '22 at 07:27
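The difference between validating a whole string and searching inside free text, as discussed in the comments above, can be sketched in plain Java. This is only a minimal sketch: the class name `UrlFinder` and its method names are hypothetical, the anchored pattern is the answer's expression with `CASE_INSENSITIVE` added, and the unanchored pattern assumes a scheme is required and that a URL ends at the first whitespace (`\S*` rather than `.*`).

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class UrlFinder {
    // Anchored: the entire input must be a URL (scheme optional).
    private static final Pattern WHOLE = Pattern.compile(
        "^((https?|ftp)://|(www|ftp)\\.)?[a-z0-9-]+(\\.[a-z0-9-]+)+([/?].*)?$",
        Pattern.CASE_INSENSITIVE);

    // Unanchored: a URL with an explicit scheme anywhere in the text.
    private static final Pattern EMBEDDED = Pattern.compile(
        "\\b(https?|ftp)://[a-z0-9-]+(\\.[a-z0-9-]+)+([/?]\\S*)?",
        Pattern.CASE_INSENSITIVE);

    static boolean isUrl(String s) {
        return WHOLE.matcher(s).matches();
    }

    static boolean containsUrl(String s) {
        return EMBEDDED.matcher(s).find();
    }

    public static void main(String[] args) {
        String text = "hehe, check this link: http://www.example.com/";
        System.out.println(isUrl(text));        // whole-string check
        System.out.println(containsUrl(text));  // search inside the text
    }
}
```

Requiring a scheme in the unanchored pattern avoids reporting tokens such as `asdj.asdj`, at the cost of missing bare names like `example.com`.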
10

This is simply done with a try/catch around the `URL` constructor (which is necessary either way).

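String inputUrl = getInput();
// Prepend a scheme only when none is present, so "https://..." input is left untouched
if (!inputUrl.contains("://"))
    inputUrl = "http://" + inputUrl;

URL url = null; // must be initialized, otherwise the null check below does not compile
try {
    url = new URL(inputUrl);
} catch (MalformedURLException e) {
    Log.v("myApp", "bad url entered");
}
if (url == null)
    userEnteredBadUrl();
else
    proceedWith(url); // placeholder name; "continue" is a reserved word and cannot be a method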
Zaid Daghestani
  • 2
    Idk how Java works, but in .NET I tried something similar. This solution doesn't seem robust. Attaching http:// to anything returns a valid URI for me. Perhaps almost anything with http:// in front is valid. lol. – Mark13426 Jan 30 '16 at 03:07
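The concern raised in the comment above also applies to `java.net.URL`: the constructor does little validation beyond the protocol, so prefixing `http://` lets most inputs parse. One possible extra check is sketched below. The class `UrlParseCheck`, the method `looksLikeUrl`, and the rule that the host must contain a dot are assumptions made for illustration, not part of the original answer. Recent JDKs deprecate the `URL` constructors; the sketch keeps `new URL` only to mirror the answer.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper; the names and the host heuristic are illustrative only.
class UrlParseCheck {
    static boolean looksLikeUrl(String input) {
        if (input == null) {
            return false;
        }
        String candidate = input.trim();
        // URLs cannot contain whitespace, so reject such input before parsing
        if (candidate.isEmpty() || candidate.chars().anyMatch(Character::isWhitespace)) {
            return false;
        }
        if (!candidate.contains("://")) {
            candidate = "http://" + candidate;
        }
        try {
            String host = new URL(candidate).getHost();
            // Assumed heuristic: the host has a dot that is neither its first nor its last character
            return host.indexOf('.') > 0 && !host.endsWith(".");
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(looksLikeUrl("example.com"));
        System.out.println(looksLikeUrl("hello"));
    }
}
```

The dot rule rejects single words such as `hello`, but it also rejects hosts like `localhost`, so it only suits apps that expect public web addresses.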
8

After looking around I tried to improve Zaid's answer by removing the try-catch block. Also, this solution recognizes more patterns as it uses a regex.

First, define this pattern:

// Pattern for recognizing a URL, based off RFC 3986
private static final Pattern urlPattern = Pattern.compile(
    "(?:^|[\\W])((ht|f)tp(s?):\\/\\/|www\\.)"
            + "(([\\w\\-]+\\.){1,}?([\\w\\-.~]+\\/?)*"
            + "[\\p{Alnum}.,%_=?&#\\-+()\\[\\]\\*$~@!:/{};']*)",
    Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);

Then, use this method (supposing str is your string):

    // separate input by spaces (URLs don't have spaces)
    String[] parts = str.split("\\s+");

    // check every part
    for (String item : parts) {
        if (urlPattern.matcher(item).matches()) {
            // it's a good URL
            System.out.print("<a href=\"" + item + "\">" + item + "</a> ");
        } else {
            // it isn't a URL
            System.out.print(item + " ");
        }
    }
Enkk
  • 2
    This doesn't recognize the link "example.com", but the idea of splitting the string by space and then checking is brilliant. Just a bit of regex tweaking is needed to reach perfection. Edit: Instead of the regex you provided, one can use android.util.Patterns.WEB_URL like this: `android.util.Patterns.WEB_URL.matcher("example.com").matches();` – lxknvlk Oct 18 '16 at 15:10
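For code that has to run outside Android, the split-then-match idea can be sketched with a pattern that also accepts bare names such as `example.com`. This is a rough sketch under stated assumptions: the class `TokenUrlExtractor`, the method `extractUrls`, the pattern itself, and the trailing-punctuation cleanup are all illustrative and come from neither the answer nor `android.util.Patterns`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; the names and the pattern are illustrative only.
class TokenUrlExtractor {
    // Assumed pattern: optional scheme or "www.", dot-separated labels, a final
    // label of two or more letters, then an optional path, query or fragment.
    private static final Pattern TOKEN_URL = Pattern.compile(
        "(https?://|www\\.)?[a-z0-9-]+(\\.[a-z0-9-]+)*\\.[a-z]{2,}([/?#]\\S*)?",
        Pattern.CASE_INSENSITIVE);

    static List<String> extractUrls(String text) {
        List<String> urls = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            // Drop sentence punctuation stuck to the end of a token, e.g. "blah.com,"
            String cleaned = token.replaceAll("[.,;:!?)]+$", "");
            if (TOKEN_URL.matcher(cleaned).matches()) {
                urls.add(cleaned);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        System.out.println(extractUrls(
            "Hey Dave, I found this great site called blah.com you should visit it"));
    }
}
```

Because the scheme is optional, the pattern still accepts any `word.word` token such as `asdj.asdj`; checking the last label against a list of known top-level domains would be one way to tighten it.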
4

Based on Enkk's answer, I present my solution:

public static boolean containsLink(String input) {
    boolean result = false;

    String[] parts = input.split("\\s+");

    for (String item : parts) {
        if (android.util.Patterns.WEB_URL.matcher(item).matches()) {
            result = true;
            break;
        }
    }

    return result;
}
lxknvlk
2

Old question, but found this, so I thought it might be useful to share. Should help for Android...

ethan123
1

I would first use java.util.Scanner to find candidate URLs in the user input using a very dumb pattern that will yield false positives, but no false negatives. Then, use something like the answer @ZedScio provided to filter them down. For example,

Pattern p = Pattern.compile("[^.]+([.][^.]+)+"); // one or more dots, so www.blah.com also qualifies
Scanner scanner = new Scanner("Hey Dave, I found this great site called blah.com you should visit it");
while (scanner.hasNext()) {
    if (scanner.hasNext(p)) {
        String possibleUrl = scanner.next(p);
        if (!possibleUrl.contains("://")) {
            possibleUrl = "http://" + possibleUrl;
        }

        try {
            URL url = new URL(possibleUrl);
            doSomethingWith(url);
        } catch (MalformedURLException e) {
            continue;
        }
    } else {
        scanner.next();
    }
}
John Watts
1

If you don't want to experiment with regular expressions and would rather use a tested method, you can use the Apache Commons Validator library to check whether a given string is a URL/hyperlink. An example follows.

Please note: This example detects whether a given text as a whole is a URL. For text that mixes regular text with URLs, one has to perform an additional step: split the string on spaces, loop through the resulting array, and validate each item.

Gradle dependency:

implementation 'commons-validator:commons-validator:1.6'

Code:

import org.apache.commons.validator.routines.UrlValidator;

// The three methods below are alternatives; distinct names let them share one class.

// Using the default constructor of the UrlValidator class
public boolean isValidUrl(String s) {
    UrlValidator urlValidator = new UrlValidator();
    return urlValidator.isValid(s);
}

// Passing a scheme set to the constructor
public boolean isValidHttpUrl(String s) {
    String[] schemes = {"http", "https"}; // add "ftp" if you need it
    UrlValidator urlValidator = new UrlValidator(schemes);
    return urlValidator.isValid(s);
}

// Passing a scheme set and a set of options to the constructor
public boolean isValidUrlWithOptions(String s) {
    // Add "ftp" if you need it. Providing no schemes validates http, https and ftp.
    String[] schemes = {"http", "https"};
    // Note: ALLOW_ALL_SCHEMES makes the validator ignore the schemes array above
    long options = UrlValidator.ALLOW_ALL_SCHEMES + UrlValidator.ALLOW_2_SLASHES + UrlValidator.NO_FRAGMENTS;
    UrlValidator urlValidator = new UrlValidator(schemes, options);
    return urlValidator.isValid(s);
}

// Possible Options are:
// ALLOW_ALL_SCHEMES
// ALLOW_2_SLASHES
// NO_FRAGMENTS
// ALLOW_LOCAL_URLS

To use multiple options, combine them with the '+' operator.

If you need to exclude project-level or transitive dependencies in Gradle while using the Apache Commons library, you can do the following (remove whichever entries you do not need from the list):

implementation('commons-validator:commons-validator:1.6') {
    exclude group: 'commons-logging'
    exclude group: 'commons-collections'
    exclude group: 'commons-digester'
    exclude group: 'commons-beanutils'
}

For more information, see the dependency list at the following link:

http://commons.apache.org/proper/commons-validator/dependencies.html

Ram Iyer
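The additional splitting step mentioned in the note above could look like the sketch below. To keep it runnable without the library on the classpath, the whole-string check is passed in as a `Predicate<String>`; the class `TextUrlCheck`, the method `containsValidUrl`, and the stand-in predicate are assumptions for illustration. With commons-validator available, `new UrlValidator()::isValid` should fit the same parameter.

```java
import java.util.Arrays;
import java.util.function.Predicate;

// Hypothetical helper; the names are illustrative only.
class TextUrlCheck {
    // True if any whitespace-separated token passes the given whole-string validator.
    static boolean containsValidUrl(String text, Predicate<String> isUrl) {
        return Arrays.stream(text.trim().split("\\s+")).anyMatch(isUrl);
    }

    public static void main(String[] args) {
        // Stand-in validator (an assumption) so the sketch runs without the library
        Predicate<String> standIn = s -> s.startsWith("http://") || s.startsWith("https://");
        System.out.println(
            containsValidUrl("the docs are at https://example.com/docs today", standIn));
    }
}
```

Taking the validator as a parameter keeps the splitting logic independent of which validation approach from this thread is used.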
0

This function works for me:

private boolean containsURL(String content){
    String REGEX = "\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";
    Pattern p = Pattern.compile(REGEX,Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(content);
    return m.find();
}

Call it like this:

boolean isContain = containsURL("Pass your string here...");
Log.d("Result", String.valueOf(isContain));

Note: I have only tested this with a string containing a single URL.

Community
Ketan Ramani
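Since the answer was only tested with a single URL, here is a minimal sketch of collecting every match with a `find()` loop, using the same regular expression. The class `UrlMatches` and the method `findAll` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class UrlMatches {
    // Same expression as in the answer above; the final character class excludes
    // punctuation such as '.' and ',' so a sentence-ending period is not captured.
    private static final Pattern URL = Pattern.compile(
        "\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]",
        Pattern.CASE_INSENSITIVE);

    static List<String> findAll(String content) {
        List<String> found = new ArrayList<>();
        Matcher m = URL.matcher(content);
        // Each call to find() resumes after the previous match
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll("See http://a.example.com/x and https://b.example.org/y?z=1."));
    }
}
```

A `containsURL`-style check is then just `!findAll(content).isEmpty()`.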
0

Use `URLUtil.isNetworkUrl(url)` or `URLUtil.isValidUrl(url)` from `android.webkit.URLUtil`.

Psijic
0
public boolean isURL(String text) {
    return text.length() > 3 && text.contains(".")
            && text.charAt(text.length() - 1) != '.' && text.charAt(text.length() - 2) != '.'
            && !text.contains(" ") && !text.contains("\n");
}
-2

The best way is to set the autoLink property on your TextView; Android will then recognize any link inside the string, change its appearance, and make it clickable.

android:autoLink="web"

Harol