15

How can I validate a large XML file (>100 MB)? I tried opening it in Internet Explorer, Firefox, and Google Chrome, and each one either crashes or does nothing.

Ciro Santilli OurBigBook.com
user363637
  • possible duplicate of [Text editor to open big (giant, huge, large) text files](http://stackoverflow.com/questions/159521/text-editor-to-open-big-giant-huge-large-text-files) – Jehof Sep 23 '11 at 11:33
  • 1
    @Jehof Nope, not at all. A text editor is different from a validator. – phihag Sep 23 '11 at 13:19
  • @phihag to be fair, the OP *did* talk about opening the file in IE and Firefox, so it's unclear whether the OP means to validate visually by hand or programmatically. – dj_segfault Sep 23 '11 at 14:13

12 Answers

13

xmllint --stream

Worked on a 1.2 GB file with memory limited to 500 MB:

ulimit -Sv 500000
xmllint --stream a.xml

Without --stream, Linux kills the process, and without ulimit, my computer jams.

However, I was not able to get output from --xpath when using --stream; see the related question "How to do command line XPath queries in huge XML files?"

Tested in Ubuntu 14.04, xmllint version 20901.

Ciro Santilli OurBigBook.com
  • 1
    It's worth noting that `xmllint` is cross-platform. I use it on Windows, and I can confirm that the `--stream` option works well there too. I didn't even need to set a memory limit to process 3.5 GB. However, the .NET library seems to be 2x faster. – Jarekczek Mar 11 '16 at 07:02
  • 1
    @Jarekczek thanks for letting me know! You don't need the `ulimit` in Linux either with `--stream`, I'm just showing people how not to brick their machines / test that it actually does not use much memory ;-) – Ciro Santilli OurBigBook.com Mar 11 '16 at 07:13
  • 1
    To maybe help with Google searches: this is the answer you need if running xmllint just stops with "Killed" when validating a huge file. – Legolas Jul 15 '22 at 14:20
8

You can try using a command-line validator, for example xmlstarlet:

$ xmlstarlet validate bigfile.xml
phihag
6

The only tool I know of that combines a large-file viewer with an XML validator for huge files is XML ValidatorBuddy. The file viewer doesn't load the complete file at once, but you can still scroll through it, and XML syntax coloring is applied. Validation uses the SAX parser from Xerces, so a document over 100 MB shouldn't be a problem.

Clemens
lichtfusion
3

Oxygen XML has huge-file support that includes validation:

http://www.oxygenxml.com/#14.1Huge_XML_Files_Support

innovimax
2

You can also use the XML Tools plugin in Notepad++; it has a function called "Check XML Syntax now". It's simple to download, and if you don't use Notepad++ already, it's a good reason to start!

Ciro Santilli OurBigBook.com
Mariaki
1

The following command worked for me: xmllint --huge

0

Windows version of XMLStarlet:

> xml val <xmlfile.xml>
0

Liquid Studio Community Edition contains a Large File Editor which can also be used to validate XML files. It has no real upper limit on the size of the files you can open: terabyte files open instantly on low-spec machines, and it's free.

Liquid Studio Large File Editor

Sprotty
0

In Java, and I'm sure in other languages, there are solutions for reading in an entire XML file and processing it as a complete DOM, and solutions that process the XML as a stream in an event-driven way. You would want the second kind of solution, which never loads the entire file in memory. See SAX for a Java solution to the problem.
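
As a minimal sketch of the event-driven approach, here is the same idea in Python rather than Java (the language choice and the function name `check_well_formed` are assumptions for illustration). It uses the standard library's `xml.sax` module, which reads the input incrementally instead of building a tree. Note that it only checks well-formedness; it does not validate against a DTD or XSD.

```python
import io
import xml.sax


def check_well_formed(source):
    """Stream-parse `source` (a file path or binary file-like object).

    Returns None if the document is well-formed, otherwise a short
    message describing the first error the parser hit.
    """
    # A bare ContentHandler ignores every event, so no tree is built
    # and nothing from the document is kept in memory.
    handler = xml.sax.ContentHandler()
    try:
        xml.sax.parse(source, handler)
    except xml.sax.SAXParseException as err:
        return (f"line {err.getLineNumber()}, "
                f"column {err.getColumnNumber()}: {err.getMessage()}")
    return None


if __name__ == "__main__":
    # Tiny in-memory inputs stand in for a large file on disk.
    print(check_well_formed(io.BytesIO(b"<a><b/></a>")))
    print(check_well_formed(io.BytesIO(b"<a><b></a>")))
```

For a real file you would pass its path, e.g. `check_well_formed("bigfile.xml")`.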

dj_segfault
0

You could try the EditiX XML editor.

If you load your document into EditiX and there are problems with the XML, e.g. mismatched opening and closing tags, the editor will still load the file, and in the bottom right corner of the screen you'll see a number displayed in red; e.g. a red "5" means there are five errors in the document.

I've not tried a 100 MB document, but I've loaded one over 15 MB and it seemed quite happy.

There's a free version.

Nigel Alderton
0

In addition to dj_segfault's comment on phihag's answer: xmlstarlet is fortunately NOT dead. They've just released version 1.3.

If you want a decent command-line tool that can manipulate XML, xmlstarlet is perfect (and pretty fast).

raincrumb
-1

On Windows you can write a simple application based on the .NET platform. The System.Xml.XmlReader class is capable of validating huge files. There is an example in this answer: "Validating an XML against referenced XSD in C#".

Community
Jarekczek
  • Contextually, a "write your own xml validating program in .NET" answer doesn't make much sense when there are already programs for all OSes to do this. – pydsigner Mar 10 '16 at 23:05
  • Packaging a third-party application along with your program is troublesome. Using a built-in operating system library, which is what I suggest, is much easier. Since Stack Overflow is for programmers, I hope such suggestions will always come up. -1 doesn't worry me, but please don't make this answer invisible. – Jarekczek Mar 11 '16 at 06:56
  • If your first thought was to validate XML in a web browser, as the OP's was, you're almost certainly not trying to ship a solution with a program. Your answer doesn't apply well to the situation. – pydsigner Mar 11 '16 at 07:20