
Let's imagine the following structure:

<div class="text">
  plainjkhtext-identifsdficccccator-123
  <img src="#" />

  PLAINTEXT-identificator-123
  <img src="1" /> 

  </div>

<div>
  <img src="2" /> 
  <img src="1" /> 
</div>

And this XPath:

//div[contains(@class, "text")]/text()[contains(translate(.,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'), 'plaintext-identificator-123')]/following-sibling::img[position()<3]

Basically, this gets up to two <img> elements that follow the "plaintext-identificator-123" text as siblings.

I need to get the <img> elements even if they don't share a parent with the text, as here:

<div class="text">
  plainjkhtext-identifsdficccccator-123
  <img src="#" />

  PLAINTEXT-identificator-123
  <b><img src="1" /></b>

  </div>

<div>
  <img src="2" /> 
  <img src="1" /> 
</div>

I cannot use the "following" axis, though, because that would also select the <img> elements from the second <div>, which doesn't even have the "text" class.

kjhughes
Tadeáš Jílek

1 Answer


This XPath,

//img[ancestor-or-self::*/preceding-sibling::node()
       [translate(normalize-space(),'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                                    'abcdefghijklmnopqrstuvwxyz') 
         = 'plaintext-identificator-123']
       [ancestor::div/@class="text"]]

selects only those img elements that follow the targeted text within the targeted div, even if the img or the targeted text is wrapped in its own markup, as requested.
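
To make the predicate easier to follow, here is a minimal sketch that spells out the same checks by hand in Python, using only the standard library's xml.dom.minidom (which has no XPath engine of its own). The helper names (string_value, normalized, in_text_div, follows_target, select_imgs) and the <root> wrapper around the sample are assumptions made for illustration, not part of the original answer, and comment nodes are ignored for simplicity.

```python
from xml.dom import minidom, Node

TARGET = "plaintext-identificator-123"
# Emulates translate(., 'A..Z', 'a..z') from the XPath (ASCII letters only).
ASCII_FOLD = str.maketrans("ABCDEFGHIJKLMNOPQRSTUVWXYZ",
                           "abcdefghijklmnopqrstuvwxyz")

def string_value(node):
    """XPath-style string value: the concatenated descendant text."""
    if node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        return node.data
    if node.nodeType == Node.ELEMENT_NODE:
        return "".join(string_value(child) for child in node.childNodes)
    return ""  # simplification: comments and other node types are ignored

def normalized(node):
    """Roughly translate(normalize-space(), 'A..Z', 'a..z')."""
    return " ".join(string_value(node).split()).translate(ASCII_FOLD)

def in_text_div(node):
    """Mirrors the predicate [ancestor::div/@class="text"]."""
    parent = node.parentNode
    while parent is not None and parent.nodeType == Node.ELEMENT_NODE:
        if parent.tagName == "div" and parent.getAttribute("class") == "text":
            return True
        parent = parent.parentNode
    return False

def follows_target(img):
    """Mirrors ancestor-or-self::*/preceding-sibling::node()[...][...]."""
    node = img
    while node is not None and node.nodeType == Node.ELEMENT_NODE:
        sibling = node.previousSibling
        while sibling is not None:
            if normalized(sibling) == TARGET and in_text_div(sibling):
                return True
            sibling = sibling.previousSibling
        node = node.parentNode
    return False

def select_imgs(xml):
    """Return the src of every img that passes the predicate, in document order."""
    doc = minidom.parseString(xml)
    return [img.getAttribute("src")
            for img in doc.getElementsByTagName("img")
            if follows_target(img)]

SAMPLE = """<root>
<div class="text">
  plainjkhtext-identifsdficccccator-123
  <img src="#" />

  PLAINTEXT-identificator-123
  <b><img src="1" /></b>
</div>
<div>
  <img src="2" />
  <img src="1" />
</div>
</root>"""

print(select_imgs(SAMPLE))  # ['1']
```

Because the comparison uses the sibling's string value, wrapping the target text in one or more elements does not change the result: the wrapper's string value is still the text it contains.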

In XPath 2.0, use lower-case() rather than translate(), or match the text with the case-insensitive i flag of matches().
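
One reason to prefer lower-case() where it is available: translate() only maps the characters that are explicitly listed, so the A-Z to a-z mapping leaves non-ASCII capitals untouched. The sketch below illustrates the difference in Python; str.translate stands in for the XPath 1.0 translate() call and str.lower for XPath 2.0 lower-case(). Both are approximations chosen for illustration (the function names xpath1_fold and unicode_fold are hypothetical), and edge cases may differ from a real XPath engine.

```python
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = "abcdefghijklmnopqrstuvwxyz"
ASCII_FOLD = str.maketrans(UPPER, LOWER)

def xpath1_fold(s):
    """Stand-in for translate(s, 'A..Z', 'a..z'): only the listed letters change."""
    return s.translate(ASCII_FOLD)

def unicode_fold(s):
    """Stand-in for lower-case(s): Unicode-aware lowercasing."""
    return s.lower()

print(xpath1_fold("PLAINTEXT-identificator-123"))  # plaintext-identificator-123
print(xpath1_fold("ÉCOLE"))                        # École
print(unicode_fold("ÉCOLE"))                       # école
```

For the ASCII-only identifier in the question, both approaches give the same result.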

kjhughes
  • Works just right. And what if the plaintext-identificator-123 must be in the same `<div>`?
    – Tadeáš Jílek Sep 17 '20 at 20:41
  • Answer updated to ensure that the targeted `text()` node is also within the targeted `div.` – kjhughes Sep 17 '20 at 21:02
  • Thank you! This will only select the elements from the `<div>` where the plaintext-identificator-123 is present, am I right? So if there are more divs with the text class that contain the identificator, it won't cause problems.
    – Tadeáš Jílek Sep 17 '20 at 21:09
  • The text must be a preceding sibling, but it can be wrapped in a tag, for example. The preceding-sibling test is not working in that case. – Tadeáš Jílek Sep 17 '20 at 21:14
  • Wow, ok, so instead of testing for preceding text node siblings, check for preceding sibling nodes of any type whose string value equals the targeted text. – kjhughes Sep 17 '20 at 21:24
  • */preceding-sibling::*/text() – Tadeáš Jílek Sep 17 '20 at 21:26
  • I used `node()` rather than `*` so as to still capture the original text-as-sibling option too. – kjhughes Sep 17 '20 at 21:28
  • But that's only one step. If the text was nested more deeply, for example, it wouldn't work. – Tadeáš Jílek Sep 17 '20 at 21:28
  • Actually it should work fine for any depth of surrounding elements because such surrounding doesn't change the ancestor's [***string value***](https://stackoverflow.com/q/34593753/290085). Try it. – kjhughes Sep 17 '20 at 21:31