4

A database holds a string of more than 10,000 characters in XML format. The XML is not well formed and I need to fix it. I have to convert the string (which contains no CRLFs) into a file that I can edit sensibly so that I can correct the tags.

I can extract the string to an editor; the tricky part is converting it to multiline, indented XML. Any help on how to tackle this kind of task?

Thanks in advance.

Steve Hibbert
  • possible duplicate of [format xml, pretty print](http://stackoverflow.com/questions/4129369/format-xml-pretty-print). Specifically, I've used `xmlstarlet` with good success - it can handle multi-megabyte files with ease. – mellamokb Mar 12 '13 at 14:14
  • Subtly different from pretty print - the problem was how to convert badly formatted XML that will not parse. – Steve Hibbert Mar 12 '13 at 15:42
  • There is also an XML Tools plugin for Notepad++ that has an option to validate XML syntax (as well as pretty print). – mellamokb Mar 12 '13 at 15:43
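
For anyone who wants a scriptable version of the "validate XML syntax" step mentioned in the comments, here is a minimal sketch using only Python's standard library. The function name `first_xml_error` is made up for illustration. It reports only the first problem the parser hits, and on a single-line string the line number will always be 1, so the column is the useful part.

```python
import xml.etree.ElementTree as ET


def first_xml_error(xml_text):
    """Return (line, column, message) for the first parse error, or None.

    Hypothetical helper: it only reports where the standard-library parser
    gives up; it does not attempt any repair.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # position reported by the parser
        return line, column, str(err)
    return None


if __name__ == "__main__":
    print(first_xml_error("<a><b/></a>"))  # well formed, so no error
    print(first_xml_error("<a><b></a>"))   # <b> is never closed
```

Fix the reported spot, run it again, and repeat until it returns `None`; after that, any ordinary pretty printer should accept the string.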

4 Answers

7

A good solution is to run:

xmllint --format file.xml

xmllint is part of libxml2-utils on Debian; see http://www.xmlsoft.org/ (also available for Windows).

Gilles Quénot
  • xmllint is also available out of the box on several other operating systems, as well as a part of the Anaconda Python distribution, etc. etc. – Ambidextrous Sep 16 '14 at 23:30
  • FYI, for anyone who finds their way here, if you're attempting to pipe stdout to `xmllint` from another command, or use redirect ` – DryLabRebel Mar 27 '23 at 05:21
1

Use this online tool: http://www.freeformatter.com/xml-formatter.html.

I use it daily and it works just fine.

kamituel
  • I wouldn't recommend an online tool for two reasons. 1. This cannot be very easily automated. 2. Depending on how it's implemented, the XML code may pass temporarily to another server, which is unwise from a privacy/security perspective. – mellamokb Mar 12 '13 at 14:15
  • True for both reasons - if you care about them. Sometimes you don't and then it's faster with online formatter than with command line stuff, when you have to install it first. – kamituel Mar 12 '13 at 14:19
  • The bug in the single-string XML input means it won't parse. I need something that attempts the reformatting and then huffs out, so that I can see where the tags are malformed. It's a tricky question, I know. – Steve Hibbert Mar 12 '13 at 14:20
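
The kind of tool asked for in the last comment (reformat without parsing, and point at the bad tags) can be roughly sketched in Python. This is an assumption-laden illustration, not what any of the tools named in this thread actually do: the function name `loose_indent` and the marker comments it appends are invented here, and the regex tokenizer will be fooled by `<` or `>` inside comments, CDATA sections, or attribute values.

```python
import re

# A token is a tag-like run "<...>", a run of text, or a lone "<".
TOKEN = re.compile(r"<[^<>]*>|[^<]+|<")


def loose_indent(xml_text, indent="  "):
    """Put each tag on its own line, indented by a simple open-tag stack.

    Nothing is parsed, so malformed input never raises; suspicious closing
    tags get a marker comment appended instead (hypothetical markers).
    """
    lines = []
    stack = []  # names of the tags that are currently open
    for match in TOKEN.finditer(xml_text):
        token = match.group().strip()
        if not token:
            continue
        if token.startswith("</"):
            name = token[2:-1].strip()
            if name in stack:
                # Pop anything left open above the matching tag and flag it.
                unclosed = []
                while stack[-1] != name:
                    unclosed.append(stack.pop())
                stack.pop()
                note = f"  <!-- UNCLOSED: {', '.join(unclosed)} -->" if unclosed else ""
                lines.append(indent * len(stack) + token + note)
            else:
                lines.append(indent * len(stack) + token + "  <!-- STRAY CLOSING TAG -->")
        elif token.startswith(("<?", "<!")) or token.endswith("/>"):
            # Declarations, comments and self-closing tags do not change depth.
            lines.append(indent * len(stack) + token)
        elif token.startswith("<") and token.endswith(">"):
            lines.append(indent * len(stack) + token)
            parts = token[1:-1].split()
            stack.append(parts[0] if parts else "")
        else:
            lines.append(indent * len(stack) + token)  # plain text
    for name in reversed(stack):
        lines.append(f"<!-- UNCLOSED AT END: {name} -->")
    return "\n".join(lines)


if __name__ == "__main__":
    print(loose_indent("<a><b>text</c></a>"))
```

On the sample string in the last line, the `</c>` line is flagged as a stray closing tag and the `</a>` line notes that `b` was left open. Search the output for the markers, correct the tags, and delete the markers before saving the result back.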
1

A colleague found it: Visual Studio has an Edit.Advanced.FormatDocument option that will have a stab at formatting trashed XML. It has got me going, finally.

Thanks all for contributions.

Steve Hibbert
0

Use the "Xml formatter" package in the Atom text editor: https://atom.io/packages/xml-formatter

It's offline, so you don't risk data privacy.

Aman