
It's very simple: there is an HTML file containing a div with a variable id, like this:

<div id="abc_1"></div>

The integer part of the id is variable, so it could be abc_892, abc_553, etc.

What is the best XPath query to select that div?

Zamblek
  • Reference: [Using regex to filter attributes in xpath with php](http://stackoverflow.com/q/6823032/367456) (Jul 2011) – hakre Aug 17 '15 at 05:34

2 Answers

//div[starts-with(@id, "abc_")]
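To see which ids this predicate accepts, here is a minimal Python sketch. It assumes well-formed markup and uses only the standard library; because xml.etree.ElementTree supports only a subset of XPath without functions such as starts-with(), the predicate is mirrored with str.startswith. The sample markup and the helper name ids_starting_with are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (must be well-formed for ElementTree).
SAMPLE = """<html><body>
<div id="abc_1"></div>
<div id="abc_892"></div>
<div id="abc_xyz"></div>
<div id="other_5"></div>
</body></html>"""


def ids_starting_with(markup, prefix):
    """Return the ids of all div elements whose id starts with prefix."""
    root = ET.fromstring(markup)
    # Mirrors //div[starts-with(@id, prefix)]: every div at any depth,
    # in document order, filtered on the id attribute.
    return [
        div.get("id")
        for div in root.iter("div")
        if div.get("id", "").startswith(prefix)
    ]


matched = ids_starting_with(SAMPLE, "abc_")
print(matched)
```

Note that a prefix test alone also accepts abc_xyz, since nothing is checked after the underscore.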
goat

The currently accepted answer selects such unwanted elements as:

<div id="abc_xyz"/>

However, only those div elements should be selected whose id not only starts with "abc_" but also has, after the _, a substring that represents an integer.

Use this XPath expression:

//div
   [@id[starts-with(., 'abc_') 
      and 
        floor(substring-after(.,'_')) 
       = 
        number(substring-after(.,'_')) 
       ]
   ]

This selects any div element that has an id attribute whose string value starts with the string "abc_" and whose substring after the _ is a valid representation of an integer.

Explanation:

Here we are using the fact that in XPath 1.0 this XPath expression:

floor($x) = number($x)

evaluates to true() exactly when $x is an integer.

This can be proven easily:

  1. If $x is an integer the above expression evaluates to true() by definition.

  2. If the above expression evaluates to true(), this means that neither of the two sides of the equality is NaN, because by definition NaN isn't equal to any value (including itself). But then this means that $x is a number (number($x) isn't NaN) and by definition, a number $x that is equal to the integer floor($x) is an integer.
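The argument above can be traced in a small Python model. This is only a sketch of the XPath 1.0 rules as I understand them, not an XPath engine: xpath_number, xpath_floor and passes_integer_test are hypothetical helper names, and the regular expression is my reading of the string-to-number conversion (optional whitespace, optional minus sign, digits with an optional fraction; anything else becomes NaN).

```python
import math
import re

# Model of XPath 1.0 number(string): optional whitespace, optional '-',
# then Digits ('.' Digits?)? or '.' Digits, then optional whitespace.
_XPATH_NUMBER = re.compile(
    r"[ \t\r\n]*-?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)[ \t\r\n]*"
)


def xpath_number(text):
    """Convert a string to a float the way number() does; NaN on failure."""
    if _XPATH_NUMBER.fullmatch(text):
        return float(text.strip(" \t\r\n"))
    return math.nan


def xpath_floor(value):
    """floor() that passes NaN and infinities through unchanged."""
    if math.isnan(value) or math.isinf(value):
        return value
    return float(math.floor(value))


def passes_integer_test(suffix):
    """Model of floor(suffix) = number(suffix) for a string argument."""
    n = xpath_number(suffix)
    # NaN compares unequal to everything, including itself, so any
    # non-numeric suffix fails here without an explicit NaN check.
    return xpath_floor(n) == n


for suffix in ["892", "xyz", "1.5", "", "1.0", "-3"]:
    print(repr(suffix), passes_integer_test(suffix))
```

In this model "892" passes while "xyz", "1.5" and the empty string fail. Strings such as "1.0" and "-3" also pass, because they convert to integer-valued numbers, so the test checks the numeric value rather than requiring a digits-only suffix.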

Alternative solution:

//div
   [@id[starts-with(., 'abc_') 
      and 
        'abc_' = translate(., '0123456789', '')
       ]
   ]
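The translate() variant can be modelled the same way. A minimal Python sketch, assuming the three-argument translate() with an empty replacement string simply deletes the listed characters; xpath_translate_remove and matches_alternative are hypothetical names introduced here.

```python
DIGITS = "0123456789"


def xpath_translate_remove(text, chars):
    """Model of translate(text, chars, ''): delete every listed character."""
    return text.translate(str.maketrans("", "", chars))


def matches_alternative(id_value):
    """Model of: starts-with(., 'abc_') and 'abc_' = translate(., DIGITS, '')."""
    return (
        id_value.startswith("abc_")
        and xpath_translate_remove(id_value, DIGITS) == "abc_"
    )


for candidate in ["abc_892", "abc_xyz", "abc_1.0", "abc_12x", "abc_"]:
    print(candidate, matches_alternative(candidate))
```

Only digits may follow the prefix here, so abc_1.0 is rejected. The bare id abc_ is accepted in this model, because removing zero digits still leaves "abc_"; if an empty suffix must be excluded, an extra length check on the suffix would be needed.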
Dimitre Novatchev
  • could you explain why that works? I'm not very familiar with xpath, so I'm guessing floor() will return a value that is never equal to itself, like sql's ternary logic(eg, `null = null` in sql is always false)? thanks. – goat Apr 26 '12 at 23:16
  • @chris: Done. BTW there was a slight inaccuracy in the expression and this is fixed now. – Dimitre Novatchev Apr 27 '12 at 01:20
  • @chris: You are welcome. Yes, XPath (even 1.0) is a very powerful language and tool for elegant solutions. – Dimitre Novatchev Apr 27 '12 at 04:40
  • Well, as I said I think that additional check is probably unnecessary, but I'm sure it'll probably be useful to some. Just out of curiosity though, would `//div[@id[translate(.,'0123456789','') = 'abc_']]` not be faster? – Flynn1179 Apr 27 '12 at 10:05
  • @Flynn1179: Both ways are O(N) -- and if one is faster this would depend on implementation. An XPath engine optimizer may or may not recognize and optimize a particular expression. I prefer `floor($x) = $x` because this is more readable and understandable and translates nicely into "type-checking". – Dimitre Novatchev Apr 27 '12 at 11:41