I know that, in general, HTML shouldn't be parsed with regex.

But I want to build a performance test for a web application, and I know for sure what the HTML will look like. So I can use regexes to extract some data from the page source.

Since this is a performance test (using JMeter), I want to use as few resources as possible on the master machine.

Which option is the least resource-intensive: XPath, regexes (Jakarta ORO), or Jsoup?

Andrei Botalov

1 Answer

As of JMeter 2.8, the answer is Regexp, although it depends, of course, on the regular expressions you use. The Regexp implementation in JMeter is fairly well optimized and is the main post-processing mechanism for correlation.

Regarding JSoup, it would need custom coding, based on a JSR223 PostProcessor for example.
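To illustrate why the regex route is cheap, here is a minimal sketch of the idea: the pattern is compiled once and then scanned over the response text, with no document tree being built. It uses `java.util.regex` from the JDK as a stand-in rather than Jakarta ORO (which JMeter uses), and the class name, the `token` field, and the sample HTML are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexExtractSketch {
    // Compiled once and reused for every response.
    // Hypothetical pattern: captures the value of an input named "token".
    private static final Pattern TOKEN =
            Pattern.compile("name=\"token\"\\s+value=\"([^\"]*)\"");

    // Returns every captured value found in the page source.
    static List<String> extractAll(String html) {
        List<String> values = new ArrayList<>();
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String html =
                "<form><input type=\"hidden\" name=\"token\" value=\"abc123\"></form>";
        System.out.println(extractAll(html)); // prints [abc123]
    }
}
```

This only works reliably because the attribute order and quoting are known in advance, which is the situation described in the question.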

JMeter 2.9 will introduce a new CSS/JQuery selector based Extractor with 2 possible underlying implementations:

See :

Its performance will be lower than Regexp because it builds a DOM document, but it makes the syntax much easier in Test Plans that don't need to be ultra-optimised.

Finally, regarding XPath: because it builds a DOM tree, it has a memory and CPU cost that is higher than regex, particularly if you want to extract many elements. An enhancement has been created:
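The extra cost of XPath comes from the step that precedes the query. The following sketch, which uses only the JDK's `javax.xml` APIs and is not JMeter's own extractor code, shows that the whole response has to be parsed into a `Document` before any expression can be evaluated. The class name, the sample markup, and the expression are hypothetical, and the input is assumed to be well-formed XHTML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathExtractSketch {
    // Parses the whole input into a DOM tree, then evaluates the expression.
    static List<String> extractAll(String xhtml, String expression) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // This step allocates a node for every element, attribute and text run.
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes =
                    (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getNodeValue());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the input", e);
        }
    }

    public static void main(String[] args) {
        String xhtml =
                "<form><input type=\"hidden\" name=\"token\" value=\"abc123\"/></form>";
        System.out.println(extractAll(xhtml, "//input[@name='token']/@value"));
    }
}
```

Compared with a regex scan, the tree has to be built and held in memory for each sampled response, which is where the additional CPU and memory go.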

igr
UBIK LOAD PACK