1

I need to write a regular expression that matches the child price, i.e. "18.99". Because several span elements share the same class names ("currency" and "price-value"), I want to anchor the regex on CHILD. The "4-11" part is dynamic data and can change.

<p class="price">CHILD 4-11yrs<br /> <span class="currency">&pound;</span> <span class="price-value">18.99</span></p>

I want a regex that starts from CHILD and fetches the price. Can anyone help me with this? Thanks in advance.

Wiktor Stribiżew
Chaitra

2 Answers

1

Since you just want to check whether the parent of the span tag whose value you need contains the literal substring CHILD, you may as well use an XPath Extractor with the following XPath query:

//span[parent::p[contains(text(),'CHILD')] and @class='price-value']/text()

Details:

  • //span - get me a span tag...
  • [parent::p[contains(text(),'CHILD')] - whose parent tag is p and whose text contains the substring CHILD
  • and - AND...
  • @class='price-value'] - the class attribute value is price-value...
  • /text() - and fetch me the value of that span.

NOTE: If the text of the p tag starts with CHILD, you may as well use starts-with:

//span[parent::p[starts-with(text(),'CHILD')] and @class='price-value']/text()
                 ^^^^^^^^^^^
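To make the predicate logic concrete outside JMeter, here is a minimal Python sketch. It uses the standard library's ElementTree, whose XPath subset has no contains() or starts-with(), so those two checks are done in plain Python. The ADULT row, its 24.99 price, the wrapping div and the function name child_prices are all hypothetical, and &pound; is written as &#163; because ElementTree only knows the predefined XML entities.

```python
import xml.etree.ElementTree as ET

# Markup from the question plus a hypothetical ADULT row for contrast.
# &pound; is replaced by &#163; so the fragment parses as XML.
SAMPLE = """<div>
<p class="price">ADULT<br /> <span class="currency">&#163;</span> <span class="price-value">24.99</span></p>
<p class="price">CHILD 4-11yrs<br /> <span class="currency">&#163;</span> <span class="price-value">18.99</span></p>
</div>"""


def child_prices(markup, needle="CHILD", starts_with=False):
    """Return the text of span.price-value children of <p> tags whose text matches needle."""
    root = ET.fromstring(markup)
    prices = []
    for p in root.iter("p"):
        # p.text is the first text node of <p>, which is what text() supplies
        # to contains()/starts-with() in the XPath predicate.
        first_text = p.text or ""
        matched = first_text.startswith(needle) if starts_with else needle in first_text
        if matched:
            # ElementTree does support the attribute-equality predicate.
            for span in p.findall("span[@class='price-value']"):
                prices.append(span.text)
    return prices


print(child_prices(SAMPLE))                    # contains(...) variant
print(child_prices(SAMPLE, starts_with=True))  # starts-with(...) variant
```

Both calls select only the span inside the CHILD paragraph, because the class check and the parent-text check must hold together.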
Wiktor Stribiżew
  • Hi Wiktor, thank you. Is it possible to extract the same value using the Regular Expression Extractor? – Chaitra Sep 06 '16 at 09:28
  • Yes, but do you really want to use such an incomprehensible [`[^<]*(?:<(?!/p>)[^<]*)*([^<]+)`](https://regex101.com/r/zH6vT2/1)? – Wiktor Stribiżew Sep 06 '16 at 09:32
  • XPath is the only valid way to proceed. Trying to parse HTML with regex [only causes headache](http://stackoverflow.com/a/1732454/3832970). – Wiktor Stribiżew Sep 06 '16 at 10:37
  • @WiktorStribiżew, CSS / JQuery is the correct way of extracting data from HTML. XPath is slow and preferred for XML content – UBIK LOAD PACK Sep 29 '16 at 06:40
  • @UBIKLOADPACK: How slow is it? If we are talking about microseconds, then it is not that important, right? Also, can you make sure with the selectors that you get the correct `span.price-value`, not the arbitrary one? There are some restrictions as you can see. – Wiktor Stribiżew Sep 29 '16 at 06:43
  • You can do with CSS/jQuery what you can do with XPath. XPath is slow in terms of ms, not microseconds. – UBIK LOAD PACK Sep 29 '16 at 06:48
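For readers who still want the regex route discussed in the comments, here is a rough Python sketch of a pattern anchored on CHILD. The pattern is an assumption written for this illustration, not necessarily the one linked above, and the ADULT row with its 24.99 price is hypothetical. It relies on a negative lookahead so the match cannot run past the closing </p>; in JMeter's Regular Expression Extractor the price would be capture group 1.

```python
import re

# Markup from the question plus a hypothetical ADULT row for contrast.
HTML = (
    '<p class="price">ADULT<br /> <span class="currency">&pound;</span> '
    '<span class="price-value">24.99</span></p>\n'
    '<p class="price">CHILD 4-11yrs<br /> <span class="currency">&pound;</span> '
    '<span class="price-value">18.99</span></p>'
)

# Assumed pattern: anchor on CHILD, skip tags and text without crossing </p>,
# then capture the text of the price-value span.
PRICE_AFTER_CHILD = re.compile(
    r'CHILD'                        # literal anchor; the age range after it may vary
    r'[^<]*(?:<(?!/p>)[^<]*)*?'     # any tags/text, but never past the closing </p>
    r'<span class="price-value">'   # the span that holds the price
    r'([^<]+)'                      # group 1: the price text
)

match = PRICE_AFTER_CHILD.search(HTML)
child_price = match.group(1) if match else None
print(child_price)
```

The pattern breaks as soon as the attribute order, quoting, or whitespace inside the span tag changes, which is the fragility the comments above warn about.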
0

To extract HTML content, it is much better and easier to use the CSS/JQuery Extractor.

To extract what you need, the expression will be:

span.price-value

Leave the attribute field empty, and that's it.
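As a rough illustration of what this selector selects, the Python sketch below emulates span.price-value with the standard library's HTMLParser. This is a stand-in for a real CSS engine, and the class name PriceValueCollector and the ADULT row are hypothetical. It shows that the selector alone returns every price on the page, so picking the CHILD one would still depend on its position among the matches.

```python
from html.parser import HTMLParser

# Markup from the question plus a hypothetical ADULT row for contrast.
HTML = (
    '<p class="price">ADULT<br /> <span class="currency">&pound;</span> '
    '<span class="price-value">24.99</span></p>'
    '<p class="price">CHILD 4-11yrs<br /> <span class="currency">&pound;</span> '
    '<span class="price-value">18.99</span></p>'
)


class PriceValueCollector(HTMLParser):
    """Collects the text of every <span> whose class list includes price-value."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "price-value" in classes:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price:
            self.values.append(data.strip())


collector = PriceValueCollector()
collector.feed(HTML)
collector.close()
print(collector.values)  # one entry per span.price-value, in document order
```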

UBIK LOAD PACK