
I am trying to load an XML document and save an exact copy. The problem is that every line feed (#10, hex 0A) is replaced with a carriage return plus line feed (#13#10, hex 0D0A).

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<!-- Information -->
<AddInsList>
  <AddInItem ID="name1">
    <Title DefaultText="Some text">
      <tag1><![CDATA[Some text]]></tag1>
    </Title>
    <Description DefaultText="some informations">
      <tag1><![CDATA[**Some text with line feed symbols 0A**]]></tag1>
    </Description>
  </AddInItem>
</AddInsList>

My code:

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, XMLIntf, XMLDoc, ActiveX, xmldom, StdCtrls, ComCtrls;

var
  Doc : IXMLDocument;

Begin
  Doc:=NewXMLDocument;

  // Preserve whitespace instead of letting it be converted to tabs
  Doc.ParseOptions := Doc.ParseOptions+[poValidateOnParse]+[poPreserveWhiteSpace];

  Doc.LoadFromFile('C:\test.xml');
  Doc.SaveToFile('C:\exact.xml');
End;
Nafalem
  • That is not your real code. It doesn't compile. If you want help here, post the **actual** code you're having problems with; posting fake code often hides the actual problem. – Ken White Jun 19 '13 at 13:09
  • @user why is this a problem? Just because you have a CDATA section doesn't mean its contents are not parsed. In this case the XML writer decides to replace the Unix-style line breaks with Windows-style line breaks. In the end the XML is still valid? – whosrdaddy Jun 19 '13 at 14:57
  • @whosrdaddy The #10 characters are only in the CDATA section, I think. Is there a way to keep both delimiters, #10 and #13#10, in the XML document? – Nafalem Jun 19 '13 at 17:01
  • @Ken White I skipped the TForm part, so you have to paste each piece into the correct section (uses, var, etc.). – Nafalem Jun 19 '13 at 17:03
  • @user2501001: Yes, I'm aware. You also had a missing `Doc` before the `ParseOptions :=`, and your XML is invalid (the `AddInsList` closing tag is missing `/`). Your code posted here should *compile* and demonstrate the problem you're trying to solve. – Ken White Jun 19 '13 at 17:06
  • Sorry about the missing code; it should be OK now. I also added a 0A character outside CDATA (0A20202020 in a hex editor), and it was replaced too. The problem I am trying to solve is keeping both 0A and 0D0A in the saved document. – Nafalem Jun 20 '13 at 04:44
  • I'd say Delphi gives us no way to define the line breaks the XML writer uses when serializing the DOM, the way .NET programmers can with XmlWriterSettings.NewLineChars. But as @whosrdaddy wrote, you shouldn't care, because every parser is obliged to normalize line breaks to [LineFeed](http://www.w3.org/TR/REC-xml/#sec-line-ends) – pf1957 Jun 20 '13 at 06:11
  • You should not have any expectation that loading XML and saving it will create an exact copy. Any of the following is subject to change: whitespace, use of new lines, indentation, tabs and more... As you have discovered one of the changes is the `LineBreak` symbol. If however, you had an expectation that the content of CDATA would not change, then you might find the following question insightful: http://stackoverflow.com/q/2784183/224704 – Disillusioned Sep 30 '13 at 11:36

1 Answer


XML parsing normalizes line breaks; XML serialization is responsible for deciding whether to convert them back to CRLF form. See http://www.w3.org/TR/REC-xml/#sec-line-ends
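Since the serializer makes that choice, one possible workaround is to post-process its output. The following is only a minimal sketch, not a built-in feature: `SaveXmlWithLfOnly` is a hypothetical helper name, and it assumes the document is written in UTF-8 (or another encoding where the bytes 0D and 0A only ever mean CR and LF). It turns every CRLF into a bare LF, so it cannot restore a mix of 0A and 0D0A; that distinction is already gone once the parser has normalized the input.

```delphi
uses
  Classes, SysUtils, XMLIntf, XMLDoc;

// Hypothetical helper (not part of the RTL): serializes Doc to memory, then
// writes the bytes to FileName, dropping every CR that directly precedes an LF.
// Assumption: the output encoding is UTF-8 or similarly byte-compatible.
procedure SaveXmlWithLfOnly(const Doc: IXMLDocument; const FileName: string);
var
  Raw: TMemoryStream;
  OutFile: TFileStream;
  Src, Dst: array of Byte;
  I, N, Count: Integer;
begin
  // Let the XML writer serialize with whatever line breaks it prefers
  Raw := TMemoryStream.Create;
  try
    Doc.SaveToStream(Raw);
    Count := Integer(Raw.Size);
    SetLength(Src, Count);
    if Count > 0 then
    begin
      Raw.Position := 0;
      Raw.ReadBuffer(Src[0], Count);
    end;
  finally
    Raw.Free;
  end;

  // Copy everything except a CR that is immediately followed by an LF
  SetLength(Dst, Count);
  N := 0;
  for I := 0 to Count - 1 do
  begin
    if (Src[I] = 13) and (I + 1 < Count) and (Src[I + 1] = 10) then
      Continue;
    Dst[N] := Src[I];
    Inc(N);
  end;

  OutFile := TFileStream.Create(FileName, fmCreate);
  try
    if N > 0 then
      OutFile.WriteBuffer(Dst[0], N);
  finally
    OutFile.Free;
  end;
end;
```

With the code from the question, this would be called as `SaveXmlWithLfOnly(Doc, 'C:\exact.xml');` in place of `Doc.SaveToFile('C:\exact.xml');`. Working on raw bytes avoids re-encoding the text, which is why the UTF-8 assumption matters.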

keshlam