
I saw the existing question with the same title, but it asks something different.

Let's say that I want to find elements that have "conGraph" in their class. I tried:

//div[contains(@class,'conGraph')]

It correctly matched

<div class='conGraph mr'>

but it also incorrectly matched

<div class='conGraph_wrap'>

which is not the same class at all. For this case only, I could search for 'conGraph ' (with a trailing space) and get away with it, but I would like to know the general solution for future use.

In short, I want to get elements whose class contains "word" as a whole, space-separated name, as in "word", "word word2", or "word3 word", but not "words", "fake_word", or "sword". Is that possible?

Damn Vegetables
  • Can you please clarify how the linked question does not answer your problem? ("I'm not @#$@# going to look up how to match words with regex" would be fine, but somewhat weak explanation) – Alexei Levenkov Jul 28 '20 at 05:27
  • @AlexeiLevenkov The linked question only covers matching the whole value. That is, "word" only, not "word word2" or "word3 word". – Damn Vegetables Jul 28 '20 at 05:28
  • Does this answer your question? [How to find the exact word using a regex in Java?](https://stackoverflow.com/questions/9464261/how-to-find-the-exact-word-using-a-regex-in-java) – Alexei Levenkov Jul 28 '20 at 05:31
  • Assuming you are fine with `match` duplicate should do... you may want to edit the question to clarify if that is acceptable. – Alexei Levenkov Jul 28 '20 at 05:32
  • @AlexeiLevenkov On the online regex test site, `\bword\b` worked, but `"//div[matches(@class,'\bconGraph\b')]"` causes `System.Xml.XPath.XPathException: 'Namespace Manager or XsltContext needed. This query has a prefix, variable, or user-defined function.'`. – Damn Vegetables Jul 28 '20 at 05:38
  • So what? Just pick a parser that is XPath 2.0 compatible... Consider the possibility that you did not provide enough information in the question to get a useful answer. Assuming you need one for regular .NET, it would be somewhat of a pain (normalize-space + contains... which is still not going to give you good word boundaries). Stare at your requirements and use LINQ-to-XML like everyone else :) – Alexei Levenkov Jul 28 '20 at 05:50

1 Answer


One option is to combine four conditions: an exact match plus three contains() calls that account for surrounding whitespace.

The first condition matches the attribute value exactly. The second, third, and fourth cover the whitespace variants: the term followed by a space, preceded by a space, or surrounded by spaces.

Data:

<div class='word'></div>
<div class='word word2'></div>
<div class='word word3'></div>
<div class='swords word'></div>
<div class='swords word words'></div>
<div class='words'></div>
<div class='fake_word'></div>
<div class='sword'></div>

XPath:

//div[@class="word" or contains(@class,"word ") or contains(@class," word") or contains(@class," word ")]

Output:

<div class='word'></div>
<div class='word word2'></div>
<div class='word word3'></div>
<div class='swords word'></div>
<div class='swords word words'></div>
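
One caveat worth checking before relying on this: contains(@class,"word ") also matches a value such as 'sword big', and contains(@class," word") also matches 'big words'. Neither appears in the sample data, so the output above is unaffected. A common XPath 1.0 idiom that avoids this pads the normalized value with spaces and then looks for the padded term. The sketch below is not part of the original answer. It only imitates the two predicates with plain Python string operations so they can be compared side by side. The function names and the two extra class values ('sword big', 'big words') are made up for illustration.

```python
import re

# XPath 1.0 idiom for "class contains this whole token". It uses only the
# core functions contains(), concat() and normalize-space().
# Assumption: this is a commonly used pattern, not the answer's own method.
TOKEN_XPATH = "//div[contains(concat(' ', normalize-space(@class), ' '), ' word ')]"


def normalize_space(value: str) -> str:
    """Imitate XPath normalize-space(): collapse whitespace runs, then trim."""
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")


def four_conditions(cls: str, term: str = "word") -> bool:
    """Plain-Python imitation of the four-condition predicate shown above."""
    return (
        cls == term
        or f"{term} " in cls
        or f" {term}" in cls
        or f" {term} " in cls
    )


def token_match(cls: str, term: str = "word") -> bool:
    """Plain-Python imitation of the predicate in TOKEN_XPATH."""
    return f" {term} " in f" {normalize_space(cls)} "


if __name__ == "__main__":
    samples = [
        "word", "word word2", "swords word words",
        "words", "fake_word", "sword",
        "sword big", "big words",  # hypothetical extra cases
    ]
    for cls in samples:
        print(repr(cls), "four:", four_conditions(cls), "token:", token_match(cls))
```

Because TOKEN_XPATH sticks to XPath 1.0 core functions, it should be usable wherever the four-condition expression is, although that is worth confirming against the actual parser in use.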
E.Wiest
  • It seems that with C# (HtmlAgilityPack), this is the only way. I thought there would be a clean solution for this, since I assumed many people would want to search for elements containing a specific CSS class, but apparently I am one of the few who need this. – Damn Vegetables Jul 28 '20 at 18:53