I ran into a little problem when using XPath to query some HTML files in C#.
Ok, first here's a sample HTML:
<table id="theTable">
<tbody>
<tr class="theClass">A</tr>
<tr class="theClass">B</tr>
<tr>1</tr>
<tr>2</tr>
<tr>3</tr>
<tr>4</tr>
<tr>5</tr>
<tr class="theClass">C</tr>
<tr class="theClass">D</tr>
<tr>6</tr>
<tr>7</tr>
<tr>8</tr>
<tr>9</tr>
<tr>10</tr>
<tr>11</tr>
<tr>12</tr>
<tr>13</tr>
<tr>14</tr>
<tr>15</tr>
<tr class="theClass">E</tr>
<tr class="theClass">F</tr>
<tr>16</tr>
<tr>17</tr>
<tr>18</tr>
<tr>19</tr>
<tr>20</tr>
<tr>21</tr>
<tr>22</tr>
</tbody>
</table>
Now, what I'm trying to do is to get only those elements that are between the B and C nodes (1,2,3,4,5,).
Here's what I tried so far:
using System;
using System.Xml.XPath;
namespace Test
{
class Test
{
static void Main(string[] args)
{
XPathDocument doc = new XPathDocument("Test.xml");
XPathNavigator nav = doc.CreateNavigator();
Console.WriteLine(nav.Select("//table[@id='theTable']/tbody/tr[preceding-sibling::tr[@class='theClass'] and following-sibling::tr[@class='theClass']]").Count);
Console.WriteLine(nav.Select("//table[@id='theTable']/tbody/tr[preceding-sibling::tr[@class='theClass'][2] and following-sibling::tr[@class='theClass'][4]]").Count);
Console.ReadKey(true);
}
}
}
This code, ran over the above HTML, outputs 19 and 5.
So only the second XPath expression works but that only because it searches for elements that have two elements with class=theClass
before them and 4 after them.
My problem starts now. I want to write a single expression that will return only the first group of elements that come after a <td class="theClass"></td>
tag, no matter how many more groups are following it.
If I run my code over this HTML
<table id="theTable">
<tbody>
<tr class="theClass">A</tr>
<tr class="theClass">B</tr>
<tr>1</tr>
<tr>2</tr>
<tr>3</tr>
<tr>4</tr>
<tr>5</tr>
<tr>6</tr>
</tbody>
</table>
it will output 0 and 0.
So it's no good.
Does anybody have any ideas?
Thank you!