I want to screen scrape several Ajax-based websites: simulate clicks that refresh part of the page, then read the updated HTML. Is there a Java library that can do this?
Viewed 2,942 times
4
- Possible duplicate of [How do you screen scrape ajax pages?](http://stackoverflow.com/questions/260540/how-do-you-screen-scrape-ajax-pages) – aioobe Jun 29 '11 at 17:53
- I think @Zubair is looking for a Java-side solution rather than a general screen scraper. Either way, the Apache-licensed HtmlUnit is the way to go. – Michael J. Lee Jun 29 '11 at 18:01
- Yes, if possible I would like to use a headless server solution, although if that isn't possible I will have to automate a browser or something similar. – yazz.com Jun 29 '11 at 18:15
- I did see the other screen scraping question mentioned, but the tools it linked to were general-purpose tools for the most part. – yazz.com Jun 29 '11 at 19:33
3 Answers
6
Use HtmlUnit; it's great for this. It is a headless browser that can simulate clicks and mouse positions and handle pretty much everything else you would want.
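To illustrate the click-then-read flow, here is a minimal sketch, not a definitive implementation. It assumes HtmlUnit 2.x is on the classpath (package `com.gargoylesoftware.htmlunit`; the 3.x releases moved to `org.htmlunit`). The class name `AjaxScrape`, the helper `clickAndRead`, the URL, the element id `refresh`, and the 5-second wait are all hypothetical placeholders.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class AjaxScrape {

    // Hypothetical helper: loads a page, clicks the element with the given id,
    // waits for background JavaScript, and returns the updated DOM as XML.
    static String clickAndRead(WebClient client, String url, String elementId)
            throws Exception {
        HtmlPage page = client.getPage(url);

        // Throws ElementNotFoundException if no element has this id.
        HtmlElement target = page.getHtmlElementById(elementId);

        // click() returns the page that results from the click; for an Ajax
        // refresh this is normally the same page with a modified DOM.
        HtmlPage updated = target.click();

        // Give pending Ajax calls and timers up to 5 seconds to finish
        // (an assumed timeout; tune it for the site being scraped).
        client.waitForBackgroundJavaScript(5000);

        return updated.asXml();
    }

    public static void main(String[] args) throws Exception {
        try (WebClient client = new WebClient(BrowserVersion.CHROME)) {
            // Real-world pages often contain scripts HtmlUnit cannot run cleanly.
            client.getOptions().setThrowExceptionOnScriptError(false);
            // Run Ajax calls triggered by the click synchronously where possible.
            client.setAjaxController(new NicelyResynchronizingAjaxController());

            // Placeholder URL and element id, not taken from the question.
            String html = clickAndRead(client, "http://example.com/page", "refresh");
            System.out.println(html);
        }
    }
}
```

Passing the `WebClient` in keeps cookies and session state across several clicks, and lets the same helper be reused for each site.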

Michael J. Lee
1
I think the only way to do this is to embed a browser so that the JavaScript is executed, and then grab the data once the DOM has been updated. This related Stack Overflow question may help.
0
These books should help you (although only the first one is intended for Java developers):

sonnuforevis