How to correctly get A or B using xpath

Question

I am trying to use XPath to search for nodes within an HTML block.

I have found that the HTML can contain arbitrary XML-style nodes, e.g. <john></john> or <john@gmail.com></john@gmail.com>.

How do I structure the XPath query to find all instances in a single search?

I've tried the following, but without any luck:

 //john@gmail.com|//anotheremail@gmail.com|//john|//anotheremail
 //john@gmail.com|anotheremail@gmail.com|john|anotheremail
 //john@gmail.com or anotheremail@gmail.com or john or anotheremail

But none of these produces the expected result set.

If I search for them individually, I can get matches.

What am I doing wrong here?

Isn't it //(john@gmail.com|another@gmail.com)? Just guessing. I'm on my phone and can't test right now — patrick, Sep 07 '17 at 23:52
Since you've tagged this with [tag:php], how are you parsing this? `DOMDocument` does **not** like those node names — Phil, Sep 08 '17 at 00:01
`<john@gmail.com>` is not a well-formed start tag in XML -- element names cannot have `@` in them -- so don't expect XPath or any other XML tools to help here until you fix that. — kjhughes, Sep 08 '17 at 00:26
Unfortunately, some mail client that sent this email (which is the HTML) inserted these tags to mark the start of the history/quote. So I can't fix it; I need to be able to handle it, which is what I'm trying to do now — Solvision, Sep 08 '17 at 00:35
Yes, I'm using PHP and DOMDocument. If it won't like/find nodes with @ in the name, is there a way to search for them some other way, but still using an XPath query? — Solvision, Sep 08 '17 at 00:37
If you can't fix it at the source, pre-process it **as a text file.** If you can't repair it as a text file to be well-formed XML, then you can't use XML libraries. To be conformant, an XML library ***necessarily*** has to reject tags such as `john@gmail.com`. — kjhughes, Sep 08 '17 at 01:35
Possible duplicate of [How to parse invalid (bad / not well-formed) XML?](https://stackoverflow.com/questions/44765194/how-to-parse-invalid-bad-not-well-formed-xml) — kjhughes, Sep 08 '17 at 01:38
@kjhughes “rule breaking is rarely bound by rules” what a great phrase. — ishegg, Sep 08 '17 at 02:16
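
The text pre-processing suggested in the comments could look roughly like the sketch below. It is only an illustration under stated assumptions: the odd tags carry no attributes, the part inside the angle brackets looks like a simple email address, and it is acceptable to rewrite them as `<span>` elements. The helper name `normalizeOddTags` and the attribute `data-orig-tag` are made up for this example. After the rewrite, a single XPath query with the union operator `|` can pick up both the rewritten tags and plain custom tags such as `<john>`.

```php
<?php
// Hypothetical helper: rewrite tags whose name contains "@" (not allowed in
// element names) into <span data-orig-tag="..."> so a DOM parser can load them.
// Assumes the odd tags have no attributes of their own.
function normalizeOddTags(string $html): string
{
    // Opening tags, e.g. <john@gmail.com>
    $html = preg_replace_callback(
        '~<([A-Za-z0-9._-]+@[A-Za-z0-9._-]+)>~',
        function (array $m): string {
            return '<span data-orig-tag="' . htmlspecialchars($m[1], ENT_QUOTES) . '">';
        },
        $html
    );

    // Closing tags, e.g. </john@gmail.com>
    return preg_replace('~</[A-Za-z0-9._-]+@[A-Za-z0-9._-]+>~', '</span>', $html);
}

// Sample input, invented for this example.
$html  = '<div><p>Reply</p><john@gmail.com>quoted</john@gmail.com><john>more</john></div>';
$clean = normalizeOddTags($html);

// <john> is not a known HTML tag, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($clean);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// One query: rewritten "@" tags plus plain custom tags, joined with "|".
$nodes = $xpath->query('//span[@data-orig-tag] | //john | //anotheremail');

foreach ($nodes as $node) {
    echo $node->nodeName, ': ', $node->textContent, PHP_EOL;
}
```

Note that each branch of the union needs its own `//`; in an expression like `//a|b`, the `b` part is evaluated relative to the context node rather than across the whole document.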

score 0 · Answer 1 · answered Sep 08 '17 at 08:45

The wording of your question is kinda vague, but I suppose you'd like to get all nodes in your HTML block? Isn't that as simple as an XPath of //*?
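
If only certain element names are wanted rather than every node, `//*` can be narrowed with a predicate. The following is a minimal sketch, assuming the names are valid element names (so no `@`) and using invented sample markup. In XPath 1.0, `or` is a boolean operator, so it belongs inside a predicate rather than between two paths.

```php
<?php
// Sample markup, invented for this example; only valid element names are used.
$html = '<div><john>one</john><p>skip</p><anotheremail>two</anotheremail></div>';

// Custom tags are unknown to the HTML parser, so silence its warnings.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Start from every element (//*) and keep those whose name is A or B.
$matches = $xpath->query('//*[name() = "john" or name() = "anotheremail"]');

$names = [];
foreach ($matches as $el) {
    $names[] = $el->nodeName;
}

echo implode(', ', $names), PHP_EOL;
```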

Simon Baars
