
I am trying to run the Google search code from the SO question "How can you search Google Programmatically Java API". Here is my code:

public class RetrieveArticles {

    public static void main(String[] args) throws UnsupportedEncodingException, IOException {
        // TODO Auto-generated method stub

        String google = "http://www.google.com/news?&start=1&q=";
        String search = "Police Violence in USA";
        String charset = "UTF-8";
        String userAgent = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"; // Change this to your company's name and bot homepage!

        Elements links = Jsoup.connect(google + URLEncoder.encode(search, charset)).userAgent(userAgent).get().children();

        for (Element link : links) {
            String title = link.text();
            String url = link.absUrl("href"); // Google returns URLs in format "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>".
            url = URLDecoder.decode(url.substring(url.indexOf('=') +1, url.indexOf('&')), "UTF-8");

            if (!url.startsWith("http")) {
                continue; // Ads/news/etc.
            }
            System.out.println("Title: " + title);
            System.out.println("URL: " + url);
        }
    }
}

When I try to run this, I get the error below. Can anyone please help me fix it?

Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: -1
    at java.lang.String.substring(String.java:1911)
    at google.api.search.RetrieveArticles.main(RetrieveArticles.java:34)

Thanks in advance.

Newbie

2 Answers


The problem is here:

url.substring(url.indexOf('=') +1, url.indexOf('&'))

Either url.indexOf('=') or url.indexOf('&') returned -1, which results in an illegal argument to substring.

You should validate the URL you are parsing before assuming that it contains = and &.
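A minimal sketch of that validation, assuming you simply want to skip any URL that lacks the expected `=` ... `&` shape. The class name `SafeUrlExtractor` and the helper `extractTarget` are hypothetical names chosen for illustration, not part of the original code:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class SafeUrlExtractor {

    // Hypothetical helper: returns the decoded text between the first '='
    // and the next '&' after it, or null if the URL does not have that shape.
    public static String extractTarget(String url) {
        int eq = url.indexOf('=');
        if (eq < 0) {
            return null; // no '=' at all, nothing to extract
        }
        int amp = url.indexOf('&', eq + 1); // only look for '&' after the '='
        if (amp < 0) {
            return null; // no '&' after '=', substring would throw
        }
        try {
            return URLDecoder.decode(url.substring(eq + 1, amp), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is unexpected
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A URL in the format described in the question's comment
        System.out.println(extractTarget(
                "http://www.google.com/url?q=http%3A%2F%2Fexample.com%2F&sa=U"));
        // An empty string, e.g. from an element with no usable href
        System.out.println(extractTarget(""));
    }
}
```

In the loop from the question, you could call this helper and `continue` whenever it returns null, instead of calling substring directly.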

Eran
  • minor point: url.indexOf('=') shouldn't be a problem since OP adds 1 to that value. – aioobe Dec 23 '14 at 12:58
  • @aioobe You got a point, though if the url doesn't contain `=`, I don't think getting the substring starting from the first character of the url is what the OP wants. – Eran Dec 23 '14 at 13:04

Add System.out.println(url); before the line

url = URLDecoder.decode(url.substring(url.indexOf('=') +1, url.indexOf('&')), "UTF-8");

Then you will see whether the url string contains '=' and '&' or not.
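As a rough illustration of what that diagnostic reveals, the sketch below prints the two indexes for a well-formed URL and for an empty string. The sample values are assumptions made for the demo (an empty string stands in for an element without a usable href), and `IndexOfDebug` and `delimiterIndexes` are hypothetical names:

```java
public class IndexOfDebug {

    // Hypothetical helper: returns { index of '=', index of '&' };
    // String.indexOf yields -1 when the character is not found.
    public static int[] delimiterIndexes(String url) {
        return new int[] { url.indexOf('='), url.indexOf('&') };
    }

    public static void main(String[] args) {
        // Assumed sample inputs, not taken from a real Google response
        String[] samples = {
            "http://www.google.com/url?q=http://example.com/&sa=U",
            ""
        };
        for (String url : samples) {
            int[] idx = delimiterIndexes(url);
            System.out.println("url=[" + url + "] '=' at " + idx[0]
                    + ", '&' at " + idx[1]);
        }
    }
}
```

Any line that reports -1 for '&' identifies a URL on which the substring call in the question would fail.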

Venkat Kondeti