
Hi, I am new to the Java networking package (java.net).

Need: I want to access a particular webpage and get its HTML contents through Java code. I am using HttpURLConnection to access the URL.

Problem with that website: I get a 403 response code from that particular website, whereas I am able to access other websites with the same code.

Details about the problematic website: It is a plain HTTP website. When I open it manually in a web browser, I can access the page and view its HTML contents.

Problematic URL: http://redbus2us.com/h1b-visa-sponsors/index.php?searchText=a&searchYear=14&action=search&pn=2

Correctly Working URL: http://www.mkyong.com/all-tutorials-on-mkyong-com/

Code:

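import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class PageCheck {
    public static void main(String[] args) {
        String base_url = "http://redbus2us.com/h1b-visa-sponsors/index.php?searchText=a&searchYear=14&action=search&pn=";
        String full_url;
        try {
            // Request each result page in turn and print its HTTP status code.
            for (int end_url = 1; end_url < 36302; end_url++) {
                full_url = base_url + end_url;
                URL url = new URL(full_url);
                HttpURLConnection url_connect = (HttpURLConnection) url.openConnection();
                System.out.println(url + "," + url_connect.getResponseCode());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}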

Please tell me whether the problem is in my code or with that particular website.

Vijay Manohar

1 Answer


The site refuses to serve content to the default Java user agent, which is why you get a 403. Set the User-Agent header to something that looks like a browser before calling getResponseCode(), for example:

url_connect.setRequestProperty("User-Agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36");
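
As a minimal sketch of where that call might go in the question's loop: the header has to be set after openConnection() and before the request is actually sent (for instance, before getResponseCode()). The class name UserAgentFetch, the helper open(), and the three-page loop limit are assumptions made for illustration, not part of the original code.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class UserAgentFetch {
    // Browser-like User-Agent string, copied from the line above.
    static final String USER_AGENT =
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 "
        + "(KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36";

    // Hypothetical helper: prepares a connection with the header set.
    // openConnection() does not send anything yet, so the header can still be changed here.
    static HttpURLConnection open(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestProperty("User-Agent", USER_AGENT);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        String baseUrl = "http://redbus2us.com/h1b-visa-sponsors/index.php"
            + "?searchText=a&searchYear=14&action=search&pn=";
        // Only a few pages here; the loop limit is an assumption for the sketch.
        for (int page = 1; page <= 3; page++) {
            HttpURLConnection conn = open(baseUrl + page);
            try {
                // getResponseCode() is what actually sends the request.
                System.out.println(conn.getURL() + "," + conn.getResponseCode());
            } finally {
                conn.disconnect();
            }
        }
    }
}
```

Whether the server then answers with 200 depends on the site; it may apply other checks besides the User-Agent header.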
Diego Basch