I want to scrape data from this website: http://www.upmandiparishad.in/commodityWiseAll.aspx

I have used JSoup before on more static HTML sites, but this one is harder for me: the HTML table only appears after I click a button, and I don't know whether JSoup can be used to trigger that button.

After clicking the button, I get an HTML table.

So how can I achieve this?

Thanks in advance

user3456343
  • possible duplicate of [scrape data from website?](http://stackoverflow.com/questions/22803854/scrape-data-from-website) – rene Apr 03 '14 at 12:16

2 Answers

It seems that you've used JSoup as an HTML parser but not as a request/response handler. Here are two options:

Option 1:

  • Figure out what happens when you press that button. A button click usually just submits a form as a POST request, so inspect that request: its URL, form fields and cookies (the Network tab in Google Chrome's dev tools is your friend).
  • Emulate that POST using the JSoup Connection interface (check its post() method).
  • Parse the returned HTML with JSoup as you already know how to.
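
The steps of Option 1 could be sketched roughly as follows. This is only a sketch under assumptions: since the URL ends in .aspx, the page is probably an ASP.NET form that expects its hidden fields (typically __VIEWSTATE and __EVENTVALIDATION) to be posted back together with the button, so the code copies every hidden input from an initial GET. The class name FormPostSketch, the helper names, and the button name btnShow in the offline sample are hypothetical; take the real button name and value from the request you see in the dev tools.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class FormPostSketch {

    // Collect every named hidden input of a page so it can be posted back.
    // Assumption: the server rejects the POST if these fields are missing.
    static Map<String, String> hiddenFields(Document page) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (Element input : page.select("input[type=hidden][name]")) {
            fields.put(input.attr("name"), input.val());
        }
        return fields;
    }

    // GET the page first, then POST its hidden fields plus the button's
    // name/value pair, reusing the cookies from the first response.
    static Document clickButton(String url, String buttonName, String buttonValue)
            throws IOException {
        Connection.Response first = Jsoup.connect(url)
                .userAgent("Mozilla/5.0")
                .method(Connection.Method.GET)
                .execute();
        Document page = first.parse();

        return Jsoup.connect(url)
                .userAgent("Mozilla/5.0")
                .cookies(first.cookies())
                .data(hiddenFields(page))
                .data(buttonName, buttonValue)
                .post();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 3) {
            // Network mode: java FormPostSketch <url> <buttonName> <buttonValue>
            Document result = clickButton(args[0], args[1], args[2]);
            System.out.println(result.select("table tr").size() + " table rows found");
            return;
        }
        // Offline demo on a made-up form; "btnShow" is a hypothetical button name.
        Document sample = Jsoup.parse(
                "<form><input type='hidden' name='__VIEWSTATE' value='abc123'>"
              + "<input type='submit' name='btnShow' value='Show'></form>");
        System.out.println(hiddenFields(sample));
    }
}
```

If the response still lacks the table, compare the fields sent here with the ones the browser sends; a dropdown selection or another form field may be required as well.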

Option 2:

  • Use a proper tool that drives a real browser instance (for example Selenium WebDriver) and perform whatever actions you need on the page (fill in forms, submit, and so on).
  • Once you are on the page you want, get its HTML and pass it to JSoup to extract your data.
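
A rough sketch of Option 2, assuming Selenium 4 with a Chrome driver available on the machine. The class name BrowserScrapeSketch and both helper names are hypothetical, the CSS selector for the button has to be supplied by you, and the sample table in the offline demo is made-up data, not taken from the site.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class BrowserScrapeSketch {

    // Open the page in a real browser, click the button, wait up to 10 seconds
    // for a table row to be present, and return the rendered HTML.
    static String renderAfterClick(String url, By button) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get(url);
            driver.findElement(button).click();
            new WebDriverWait(driver, Duration.ofSeconds(10))
                    .until(ExpectedConditions.presenceOfElementLocated(
                            By.cssSelector("table tr")));
            return driver.getPageSource();
        } finally {
            driver.quit(); // always close the browser
        }
    }

    // Parse HTML with JSoup and return the text of each cell, row by row.
    static List<List<String>> tableRows(String html) {
        Document doc = Jsoup.parse(html);
        List<List<String>> rows = new ArrayList<>();
        for (Element tr : doc.select("table tr")) {
            List<String> cells = new ArrayList<>();
            for (Element cell : tr.select("th, td")) {
                cells.add(cell.text());
            }
            rows.add(cells);
        }
        return rows;
    }

    public static void main(String[] args) {
        if (args.length == 2) {
            // Browser mode: java BrowserScrapeSketch <url> <buttonCssSelector>
            String html = renderAfterClick(args[0], By.cssSelector(args[1]));
            tableRows(html).forEach(System.out::println);
            return;
        }
        // Offline demo on made-up sample data.
        String sample = "<table><tr><th>Commodity</th><th>Price</th></tr>"
                      + "<tr><td>Wheat</td><td>1500</td></tr></table>";
        tableRows(sample).forEach(System.out::println);
    }
}
```

The wait condition is deliberately loose: if the page already contains a layout table before the click, wait for something specific to the result table instead.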

Good luck!

Curro

Let's say the page has several input tags, such as a text input and a password field; I'm guessing you know this. To emulate the button, pass its name and value with data("name here", "value here") and then call post(), which sends the same form submission the button would.

Example: suppose this is the HTML for the button: <INPUT TYPE=SUBMIT NAME="submit" VALUE="SUBMIT" ALIGN = "center">

Then this would be the call that automates it:

Jsoup.connect("<url here>").userAgent("Chrome").data("submit","SUBMIT").post();
nj-ath