
I am trying to download the contents of a site. The site is a Magento site where one can filter results by selecting properties in the sidebar. See zennioptical.com for a good example.

So if we use zennioptical.com as an example, I need to download all the rectangular glasses, or all the plastic ones, etc.

So how do I send a request to the server to display only the rectangular frames, etc.?

Thanks so much

BoltClock
FredTheLover
  • Can you clarify what you mean by having to "execute AJAX"? If it's a simple AJAX request you must submit, that's easy; if you need to run JavaScript code, that is far, far more complicated. – cheeken Jun 22 '12 at 22:46
  • "I am trying to download the contents of a site." Are you screen-scraping someone else's site, or is this your site? – Diodeus - James MacFarlane Jun 25 '12 at 20:12
  • Rather contact and ask them for a public webservice API. Webservices usually allows you to get specific data in easily parseable formats like XML, JSON or CSV. It also costs much less bandwidth and development headache for the both sides. Note that plain website scraping often violates the terms of service of the site in question and you risk to get IP-banned for that or even worse. Make sure that you've read it. – BalusC Jun 25 '12 at 20:21
  • @BalusC That's really the right answer, but I don't get the impression the OP is going that direction >.> +1 though – Windle Jun 26 '12 at 13:51

1 Answer


The basic answer is that you need to do an HTTP GET request with the correct query parameters. I'm not totally sure how you are trying to do this based on your question, so here are two options.

If you are trying to do this from JavaScript, you can look at this question. It has a bunch of answers that show how to perform AJAX GETs with the built-in XMLHttpRequest or with jQuery.

If you are trying to download the page from a Java application, this really doesn't involve AJAX at all. You'll still need to do a GET request, but now you can look at this other question for some ideas.
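For the Java route, here is a minimal sketch of a plain GET request. It assumes Java 11 or later (for `java.net.http`); the class name `PageFetcher` and the User-Agent string are made up for illustration, and the site may well require other headers or cookies that this does not send.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper class; the name is an assumption, not part of any library.
final class PageFetcher {

    // Builds a plain GET request. The User-Agent value is an arbitrary placeholder.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", "example-fetcher/0.1")
                .GET()
                .build();
    }

    // Sends the request and returns the response body as text.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("Pass the URL to fetch as the only argument.");
            return;
        }
        System.out.println(fetch(args[0]));
    }
}
```

Nothing here is AJAX-specific: the server only sees a GET with a query string, whether it comes from a browser script or from Java.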

Whether you are using JavaScript or Java, the hard part will be figuring out the right URLs to query. If you are scraping someone else's site, you will have to see which URLs your browser requests when you filter the results. One of the easiest ways to see that is the Firefox Web Console, found at Tools->Web Developer->Web Console. You could also use something like Wireshark, which is a good tool to have around but probably overkill for what you need.

EDIT

For example, when I clicked the "rectangle frames" option at zenni optical, this is the query that fired off in the Web Console:

[16:34:06.976] GET http://www.zennioptical.com/?prescription_type=single&frm_shape%5B%5D=724&nav_cat_id=2&isAjax=true&makeAjaxSearch=true [HTTP/1.1 200 OK 2328ms]
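To make a captured URL like that easier to read, you can split its query string into decoded name/value pairs. This is only a sketch, assuming Java 10 or later; `QueryInspector` and `parseQuery` are names I made up, and what each parameter means to the server is something you would still have to work out by experiment.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for inspecting captured request URLs.
final class QueryInspector {

    // Splits the query string into decoded names, each mapped to its values in order.
    static Map<String, List<String>> parseQuery(String url) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        String rawQuery = URI.create(url).getRawQuery();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String rawName = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            String name = URLDecoder.decode(rawName, StandardCharsets.UTF_8);
            String value = URLDecoder.decode(rawValue, StandardCharsets.UTF_8);
            // A name can repeat (array-style parameters), so collect values in a list.
            params.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
        return params;
    }

    public static void main(String[] args) {
        // The URL captured in the Web Console above.
        String captured = "http://www.zennioptical.com/?prescription_type=single"
                + "&frm_shape%5B%5D=724&nav_cat_id=2&isAjax=true&makeAjaxSearch=true";
        parseQuery(captured).forEach(
                (name, values) -> System.out.println(name + " = " + values));
    }
}
```

The `%5B%5D` in the captured URL is just the percent-encoded form of `[]`, so the shape filter is a parameter named `frm_shape[]` with the value 724.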

You'll have to capture enough of these requests to work out how to generate the URLs for the results you want.
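Once you have spotted the pattern, generating the URLs is mostly string building. Here is a rough sketch, again assuming Java 10 or later. `FilterUrlBuilder`, `buildUrl` and `shapeUrl` are hypothetical names, 724 is the only shape ID I actually observed, and it is an assumption that other filters follow the same parameter layout.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper class that rebuilds the filter URL seen in the Web Console.
final class FilterUrlBuilder {

    static final String BASE = "http://www.zennioptical.com/";

    // Joins the parameters into a query string, percent-encoding names and values.
    static String buildUrl(String base, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return base + "?" + query;
    }

    // Copies the parameters from the captured request, swapping in the shape ID.
    // Assumption: the other parameters can stay fixed while frm_shape[] varies.
    static String shapeUrl(String shapeId) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("prescription_type", "single");
        params.put("frm_shape[]", shapeId);
        params.put("nav_cat_id", "2");
        params.put("isAjax", "true");
        params.put("makeAjaxSearch", "true");
        return buildUrl(BASE, params);
    }

    public static void main(String[] args) {
        // 724 was the ID sent when the "rectangle frames" option was clicked.
        System.out.println(shapeUrl("724"));
    }
}
```

Comparing the generated URL against what the browser actually sends is a quick sanity check before you start requesting pages in bulk.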

DISCLAIMER

If you are downloading someone else's data, it would be best to check with them first. The owner of the server may not appreciate what they might consider stealing their data/work. And depending on how you use the data you pull down, you could be venturing into all sorts of ethical issues. Then again, if you are downloading from your own site, go for it.

Community
Windle