0

I have a file that is sent over HTTPS, which I handle with a PHP script.

The data is accepted using:

$data = file_get_contents('php://input');

If written to a file it is written as one line.

Because of our internal systems (IBM POWER7), I.T. tells me that I need to add a carriage return after each XML element.

So the file currently opens in an editor as:

<root><element1><element2></element2></element1></root>

I need it to be:

<root>
<element1>
<element2></element2>
</element1>
</root>

This requires inserting "\n" after each closing tag and after each opening tag that has children.

Any ideas?

rsmarsha
  • I am not sure, but have you tried `\n` at the end of the element? – jogesh_pi Mar 27 '12 at 11:14
  • Sounds like someone is using Notepad to open your files, instead of a real editor. – Matt Ball Mar 27 '12 at 11:15
  • It's valid XML either way, you need to get it formatted for ease of reading? – Francisc Mar 27 '12 at 11:16
  • You could use XMLReader and XMLWriter to parse the file. In particular, XMLWriter allows you to output arbitrary text, whereas SimpleXML does not (afaik). – halfer Mar 27 '12 at 11:17
  • @Francisc - no, I think the OP has a legacy processing system that will only read the XML if it is formatted this way. – halfer Mar 27 '12 at 11:18
  • I take the file sent over HTTPS and retrieve that data using $data = file_get_contents('php://input'). If I write that to a file it writes as one line; I need to write it as multiple lines, as mentioned above. – rsmarsha Mar 27 '12 at 11:18
  • halfer is right: the system is an IBM POWER7 and requires elements to be on their own lines. – rsmarsha Mar 27 '12 at 11:20
  • Why isn't there a linebreak in the third line? – Tim Pietzcker Mar 27 '12 at 11:21
  • So, how did you do the write? – ajreal Mar 27 '12 at 11:27
  • The write is done using a standard fwrite, dumping the received string into an XML file. @tim there is no break on the 3rd line as it's one element with no children. – rsmarsha Mar 27 '12 at 11:32

2 Answers

2

The formatOutput option of DOMDocument will do it.
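A minimal sketch of how this could look, assuming the payload is well-formed XML. The helper name `formatXmlWithCrlf` and the sample string are hypothetical and only for illustration. It also converts LF to CRLF, since the comments on this answer trace the remaining problem to Notepad expecting CRLF line endings. Note that the serializer indents nested elements and may write an empty element in self-closing form, so the output is not byte-for-byte the layout shown in the question.

```php
<?php
// Hypothetical helper (not from the original answer): pretty-print XML
// with DOMDocument and emit CRLF line endings.
function formatXmlWithCrlf(string $data): string
{
    $doc = new DOMDocument();
    // Discard whitespace-only text nodes so the serializer can re-indent freely.
    $doc->preserveWhiteSpace = false;

    if (!$doc->loadXML($data)) {
        throw new InvalidArgumentException('Input is not well-formed XML.');
    }

    // Ask the serializer to put nested elements on their own lines.
    $doc->formatOutput = true;

    $xml = $doc->saveXML();

    // The serializer writes LF; convert to CRLF for tools that expect it.
    return preg_replace('/\r?\n/', "\r\n", $xml);
}

// In the question the input would come from file_get_contents('php://input').
$sample = '<root><element1><element2></element2></element1></root>';
echo formatXmlWithCrlf($sample);
```

The returned string can then be written with fwrite or file_put_contents as before. If the receiving system expects LF only, the final preg_replace step can simply be dropped.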

ceving
  • I tried: $doc = new DOMDocument(); $doc->formatOutput = true; $doc->loadXML($data); $doc->save("data/test.xml"); but it just saves in the same format with no returns. – rsmarsha Mar 27 '12 at 11:47
  • Set formatOutput after loading the file as suggested [here](http://stackoverflow.com/questions/768215/php-pretty-print-html-not-tidy). – ceving Mar 27 '12 at 11:58
  • Tried that also, and Notepad still shows it all on a single line. :( – rsmarsha Mar 27 '12 at 12:03
  • @rsmarsha does Notepad show tiny boxes instead of line breaks? Try opening the file in a real editor (e.g. [Notepad++](http://notepad-plus-plus.org/)) – Kaii Mar 27 '12 at 12:08
  • It shows nothing, an example is: "xml version="1.0" encoding="ISO-8859-1"?> LXKCMSSalesOrderRequestCMDBOut SALESORDERREQUEST " – rsmarsha Mar 27 '12 at 12:59
  • In something like NetBeans or PSPad it formats the XML fine. Our internal system, though, sees it as one line, like Notepad. I did manage to drop it into an array and then write it as a string, but I cut some XML off with the code; I might try to work on that again. – rsmarsha Mar 27 '12 at 13:00
  • The DOM method does work. It turns out there was confusion: our I.T. was looking at it in Notepad, which expects CRLF, and the file is generated using LF. So I think it's sorted; testing it now. – rsmarsha Mar 27 '12 at 13:42
  • @rsmarsha that's why I asked you to use a "real editor" like Notepad++ ;) – Kaii Mar 27 '12 at 16:16
2

If it's just for inserting line-breaks, a regex will do fine.

However, do NOT start parsing XML with Regular Expressions!

Try this:

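$xml = preg_replace(
    // Match any tag that is not followed by its own closing tag or by the end of input.
    '=(<(.*?)>)(?![^<>]*</\2>|$)=s',
    "\\1\n", // Re-emit the matched tag followed by a newline.
    file_get_contents('php://input')
);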

The expression matches all XML tags that are not followed by EOF or a matching closing tag using a negative lookahead assertion [(?!..)] and a backreference.
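As a quick check, the same expression can be run against the sample from the question. This is only a sketch: the function name `insertLineBreaks` and the `$sample` string are hypothetical stand-ins for the php://input payload.

```php
<?php
// Hypothetical wrapper around the expression above, so it can be tried on a plain string.
function insertLineBreaks(string $xml): string
{
    return preg_replace(
        '=(<(.*?)>)(?![^<>]*</\2>|$)=s',
        "\\1\n",
        $xml
    );
}

// Sample taken from the question; real input would come from php://input.
$sample = '<root><element1><element2></element2></element1></root>';
echo insertLineBreaks($sample);
```

For this sample the empty `<element2></element2>` pair stays on one line, because the lookahead rejects an opening tag that is directly followed by its own closing tag and the lazy group then extends to cover the whole pair.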

Community
Kaii
  • Regular expression hacks will fail on [CDATA](http://en.wikipedia.org/wiki/CDATA) parts. – ceving Mar 27 '12 at 11:45
  • @ceving you are right. If there is CDATA inside the string of the OP, this regex will possibly insert line breaks in between. An exception for CDATA could be worked into the regex, too. However, the correct solution is using an XML DOM parser, like ceving suggests. – Kaii Mar 27 '12 at 11:49