I am trying to get the URL of the first search result. So far, I have tried fetching the page as HTML using an InputStream inside an AsyncTask, then reading the resulting string and extracting the first URL with a Java regex.
String str = result;
String regex = "\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
if (matcher.find()) {
    System.out.println(matcher.group());
    Toast.makeText(getBaseContext(), matcher.group(), Toast.LENGTH_LONG).show();
}
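To make the extraction step easier to reproduce off-device, here is a minimal self-contained sketch of the same regex logic without the Android pieces (no Toast, no AsyncTask). The class name FirstUrlExtractor, the helper firstUrl, and the sample HTML strings are made up for illustration; only the pattern is taken from the snippet above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstUrlExtractor {
    // Same pattern as in the snippet above, compiled once and reused.
    private static final Pattern URL_PATTERN = Pattern.compile(
            "\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]");

    // Hypothetical helper: returns the first URL found in the text, if any.
    public static Optional<String> firstUrl(String html) {
        Matcher matcher = URL_PATTERN.matcher(html);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for the downloaded page.
        String sample = "<a href=\"https://example.com/page?id=1\">first</a>"
                + "<a href=\"http://example.org/\">second</a>";
        System.out.println(firstUrl(sample).orElse("no URL found"));

        // Input without any absolute URL, similar to what I see on the device.
        System.out.println(firstUrl("<p>no links here</p>").orElse("no URL found"));
    }
}
```

Returning an Optional makes the "no URL in the page" case explicit, which is the situation I run into with the HTML saved on the device.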
My code reliably extracts the first URL from an HTML file, but I have noticed that the HTML contains no URLs when I save it from an Android device. Is there a better way of doing this?