I have a method that returns a list of the links found in a string, but it only works if a link has an 'http' or 'www' prefix (http://site.com or www.site.com). I also need to detect links without a prefix, such as plain site.com. Please help.

import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

ArrayList<String> retrieveLinks(String text) {
    ArrayList<String> links = new ArrayList<>();

    // Matches only links that start with http://, https:// or www.
    String regex = "\\(?\\b(http://|https://|www[.])[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]";
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(text);
    while (m.find()) {
        String urlStr = m.group();

        // Strip a pair of enclosing parentheses, e.g. "(http://site.com)"
        if (urlStr.startsWith("(") && urlStr.endsWith(")")) {
            urlStr = urlStr.substring(1, urlStr.length() - 1);
        }
        links.add(urlStr);
    }
    return links;
}
eltabo
user2052497

1 Answer

Not commenting on the rest of the source code:

Make the prefix optional, using a ? after the group that declares the possible prefixes.

String regex = "\\(?\\b(http://|https://|www[.])?[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]";

See live test here.
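
As a rough sketch of what the optional prefix does in practice, the snippet below runs the regex above against a sample sentence. It also tries a stricter pattern that requires a dot followed by a top-level-domain-like suffix. That second pattern, the class name LinkDemo, the helper findAll and the sample text are assumptions made for illustration and are not part of the original post. The stricter pattern only matches host names, not paths or query strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkDemo {
    // Regex from the answer: the ? after the prefix group makes the prefix optional.
    static final Pattern OPTIONAL_PREFIX = Pattern.compile(
        "\\(?\\b(http://|https://|www[.])?[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]");

    // Hypothetical stricter variant (an assumption, not from the thread):
    // optional prefix, dot-separated labels, then a dot and at least two letters.
    static final Pattern NEEDS_DOT = Pattern.compile(
        "\\b(https?://|www[.])?[A-Za-z0-9-]+([.][A-Za-z0-9-]+)*[.][A-Za-z]{2,}\\b");

    // Collects every match of the pattern in the text, in order of appearance.
    static List<String> findAll(Pattern p, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "see site.com and http://example.org";
        // With the prefix optional, ordinary words match too:
        // prints [see, site.com, and, http://example.org]
        System.out.println(findAll(OPTIONAL_PREFIX, text));
        // Requiring a dotted suffix filters out the plain words:
        // prints [site.com, http://example.org]
        System.out.println(findAll(NEEDS_DOT, text));
    }
}
```

The first output shows that site.com is now detected, but so is every other word, so some extra constraint along the lines of the second pattern is probably needed.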

Jorge Ferreira
  • This whole regex is really vague; the prefix might be the only thing differentiating a URL from a random string... OP, you might want to work on the rest of the regex first (you can already find samples online for URL validation, a common issue). Also, among other things, `http://|https://` really is just `https?://`. – Robin Jan 24 '14 at 00:15