1

If I want to change the text or add an element in an XML file, I can simply read the file into a string, replace or add elements with string operations, and then write it back out as XML.

In which use cases is that approach a bad idea? Why do we need to manipulate XML using libraries such as XML DOM or XPath?

Anh
  • Have you ever tried to parse XML with direct string manipulation? Anything beyond even the most simplistic patch can become a giant pain. [RegEx](https://stackoverflow.com/q/1732348/691711) won't help you either. – zero298 May 05 '20 at 13:10
  • Just to add to other answers: we get a lot of questions on StackOverflow of the form "how do I generate XML with (the attributes in a particular order | no newline between the attributes | decimal character references rather than hex | every element on a single line) because the receiving application will only accept it in that format". XML is about standards, and you'll never conform to the XML standard if you try parsing XML by hand. – Michael Kay May 05 '20 at 18:54

1 Answer

3

The disadvantage of manipulating XML via string operators is that achieving a parsing-dependent goal for even one particular XML document is already harder than using a proven XML parser. Achieving the goal for equivalent XML document variations will be nearly impossible, especially for anyone naive enough to be considering such an approach in the first place.
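As a rough illustration of the "equivalent variations" problem, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The sample documents and the helper names `rename_by_string` and `rename_by_parser` are made up for this example; they are not from the question. All three inputs mean the same thing to an XML parser, yet a literal string replacement only matches the one spelling it was written against.

```python
import xml.etree.ElementTree as ET

# Three spellings of the same document (hypothetical sample data).
# Quote style and insignificant whitespace inside tags differ, nothing else.
variants = [
    '<item id="42"><name>old</name></item>',
    "<item id='42'><name>old</name></item>",
    '<item  id = "42" >\n  <name >old</name >\n</item>',
]


def rename_by_string(xml_text: str) -> str:
    """Naive approach: replace one exact literal substring."""
    return xml_text.replace(
        '<item id="42"><name>old</name>',
        '<item id="42"><name>new</name>',
    )


def rename_by_parser(xml_text: str) -> str:
    """Parser approach: find the element, change its text, serialize again."""
    root = ET.fromstring(xml_text)
    if root.get("id") == "42":
        root.find("name").text = "new"
    return ET.tostring(root, encoding="unicode")


# True where the string replacement actually changed something.
string_hits = [rename_by_string(v) != v for v in variants]

# The <name> text after the parser-based edit, for each variant.
parser_names = [
    ET.fromstring(rename_by_parser(v)).findtext("name") for v in variants
]

print(string_hits)   # only the first spelling is matched
print(parser_names)  # every variant is updated
```

The string version could be patched for each new spelling, but every patch is one more special case that a parser already handles.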

Not convinced?

Scan the table of contents of the Extensible Markup Language (XML) 1.0 (Fifth Edition), W3C Recommendation 26 November 2008. If you do not understand everything there, your hand-written, poor imitation of an XML parser will fail, if not on your first test case, then on future variations that you're obligated to handle if you wish to claim your code works with XML. To mention just a few challenges, your program should:

  1. Report if its input XML is not well-formed.
  2. Handle character and entity references.
  3. Handle comments and CDATA sections.
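The three challenges above can be sketched in a few lines of Python with the standard library's `xml.etree.ElementTree`. This is only an illustration under assumptions: the sample documents and the helper name `is_well_formed` are invented for the example, and any conforming XML parser in another language would behave similarly.

```python
import xml.etree.ElementTree as ET


# 1. Well-formedness: a parser rejects broken input instead of silently
#    producing garbage. (is_well_formed is a hypothetical helper.)
def is_well_formed(xml_text: str) -> bool:
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


mismatched = "<note><to>Anh</note>"          # <to> is never closed
balanced = "<note><to>Anh</to></note>"

# 2. Character and entity references: the parser decodes them on input
#    and escapes special characters again on output.
doc = ET.fromstring("<t>Fish &amp; Chips &#169; &#xA9;</t>")
decoded = doc.text                            # references are resolved

out = ET.Element("t")
out.text = "R&D < 5"
serialized = ET.tostring(out, encoding="unicode")  # & and < get escaped

# 3. Comments and CDATA: markup inside a comment is not an element, and
#    "<" inside CDATA is plain text. A substring search cannot tell.
tricky = "<r><!-- <price>1</price> --><price><![CDATA[5 < 10]]></price></r>"
real_prices = [p.text for p in ET.fromstring(tricky).iter("price")]
naive_count = tricky.count("<price>")         # also counts the commented-out one

print(is_well_formed(mismatched), is_well_formed(balanced))
print(decoded)
print(serialized)
print(real_prices, naive_count)
```

A string-based tool would need its own logic for each of these cases, which amounts to writing a parser after all.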

Tempted to parse XML via string operators, including regex? Don't do it. Use a real XML parser.

kjhughes