I would like to write a crawler that supports storing cookies and maintaining sessions. There are two Java headless browser implementations I am considering: HtmlUnit and HttpUnit. HtmlUnit has better JavaScript support and perhaps better HTML parsing. Is there any reason to use HttpUnit instead for the sake of crawler performance?
Viewed 1,405 times
2
-
[cockies](http://www.urbandictionary.com/define.php?term=cockie) LOL :)) Careful with the typo's – Armen Tsirunyan Aug 28 '11 at 09:44
-
I doubt you'll find any performance comparison. Those are not optimized for speed: their goal is mainly to implement unit tests, which don't need top performance. Measure by yourself, but the network will certainly be the bottleneck, not the Java code. – JB Nizet Aug 28 '11 at 09:54
-
I am using HtmlUnit for an application of mine. I basically sped up my implementation by disabling CSS and removing Java applets and ActiveX from the source. – Opal Jan 24 '13 at 23:21
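To act on the "measure by yourself" advice in the comments, a minimal timing harness along these lines might help. It is only a sketch using the Java standard library: `FetchTimer`, `medianNanos`, and the sleeping `placeholderFetch` task are hypothetical names introduced for illustration, and the placeholder would need to be replaced with a real page load through HtmlUnit or HttpUnit to compare the two.

```java
import java.util.Arrays;

// Hypothetical helper (not part of HtmlUnit or HttpUnit): times any Runnable.
final class FetchTimer {

    // Runs the task warmupRuns times untimed, then measuredRuns times timed,
    // and returns the median wall-clock duration in nanoseconds
    // (the upper median when measuredRuns is even).
    static long medianNanos(Runnable task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // let the JIT and any connection setup settle
        }
        long[] samples = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[measuredRuns / 2];
    }

    public static void main(String[] args) {
        // Assumption: stands in for "fetch one page with library X".
        Runnable placeholderFetch = () -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        long median = medianNanos(placeholderFetch, 2, 5);
        System.out.println("median ms: " + median / 1_000_000.0);
    }
}
```

Running the same harness once with each library against the same pages, and once with a plain HTTP fetch, should show whether the Java code or the network dominates the total time.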
1 Answer
0
There is a relevant article here, from one of the HtmlUnit developers.
It basically says that, apart from JavaScript support, HtmlUnit is higher level than HttpUnit. HtmlUnit also seems to be more actively developed (2 releases in 2014, while HttpUnit has not been updated since 2008).

Yiannis Dermitzakis