I have been doing web scraping for a few months now, and I always get stuck on pages that load their data using JavaScript.
I have had some success on such pages using HtmlUnit, but sometimes HtmlUnit throws unusual exceptions and ultimately fails to load the page. In short, using HtmlUnit has been hit and miss.
Is there a reliable way to scrape pages like this?
Admittedly, I haven't dug deep into HtmlUnit. So what would you suggest? Should I stick with HtmlUnit, or are there other good methods (libraries) for processing JavaScript?
For the record, I am using Java as my primary language.