I have a 15 GB XML file that I want to split. It has approximately 300 million lines. It doesn't have any top-level nodes that are interdependent. Is there a tool available that readily does this for me?
10 Answers
QXMLEdit has a dedicated function for that: I used it successfully with a Wikipedia dump. The ~2.7 GiB file became ~1,400,000 files (one per page). It even lets you distribute them into subfolders.

I don't know why you were downvoted, this is a very useful, open source tool. – jeffmcneill Apr 01 '17 at 08:50
**This should be the accepted answer.** Very useful tool, open source, free to use. – Alastair Campbell Aug 31 '20 at 13:35
Some additional details may have helped with voting. Here is their tutorial: https://qxmledit.org/tutorials/splitFiles.pdf – Tim Friesen Nov 25 '20 at 19:26
Does the job. No command line though... Or else I am missing it. – Tom Mar 12 '23 at 11:52
XmlSplit - A Command-line Tool That Splits Large XML Files
xml_split - split huge XML documents into smaller chunks
Split that XML by bhayanakmaut (No source code and I could not get this one working)
A similar question: How do I split a large xml file?
I get Error #16, saying the maximum file size limit was exceeded, for a 1 GB file. What is the maximum size it can split? – Masud Rahman Oct 05 '12 at 04:57
Here is a low-memory-footprint script to do it in the free firstobject XML editor (foxe) using CMarkup file mode. I am not sure what you mean by no interdependent top nodes, or tag checking, but assuming that under the root element you have millions of top-level elements containing object properties or rows that each need to be kept together as a unit, and you want, say, 1 million per output file, you could do this:
split_xml_15GB()
{
  int nObjectCount = 0, nFileCount = 0;
  CMarkup xmlInput, xmlOutput;
  xmlInput.Open( "15GB.xml", MDF_READFILE );
  xmlInput.FindElem(); // root
  str sRootTag = xmlInput.GetTagName();
  xmlInput.IntoElem();
  while ( xmlInput.FindElem() )
  {
    if ( nObjectCount == 0 )
    {
      ++nFileCount;
      xmlOutput.Open( "piece" + nFileCount + ".xml", MDF_WRITEFILE );
      xmlOutput.AddElem( sRootTag );
      xmlOutput.IntoElem();
    }
    xmlOutput.AddSubDoc( xmlInput.GetSubDoc() );
    ++nObjectCount;
    if ( nObjectCount == 1000000 )
    {
      xmlOutput.Close();
      nObjectCount = 0;
    }
  }
  if ( nObjectCount )
    xmlOutput.Close();
  xmlInput.Close();
  return nFileCount;
}
I posted a youtube video and article about this here:

I think you'll have to split manually unless you are interested in doing it programmatically. Here's a sample that does that, though it doesn't mention the max size of handled XML files. When doing it manually, the first problem that arises is how to open the file itself.
I would recommend a very simple text editor - something like Vim. When handling such large files, it is always useful to turn off all forms of syntax highlighting and/or folding.
Other options worth considering:
EditPadPro - I've never tried it with anything this size, but if it's anything like other JGSoft products, it should work like a breeze. Remember to turn off syntax highlighting.
VEdit - I've used this with files of 1GB in size, works as if it were nothing at all.

If you're asking about the CodeProject link, I think it inserts Root nodes at the beginning and end of each split file. – Cerebrus Mar 31 '09 at 06:46
I can vouch for EmEditor's efficiency at editing huge files. Good editor, deserves to be better known; shame the free version was dropped. – bobince Mar 31 '09 at 15:36
Thanks, @bobince. I haven't had an opportunity to use it myself but have heard about its effectiveness. – Cerebrus Mar 31 '09 at 15:50
The open source library comma has several tools to find data in very large XML files and to split those files into smaller files.
https://github.com/acfr/comma/wiki/XML-Utilities
The tools were built using the expat SAX parser so that they did not fill memory with a DOM tree like xmlstarlet and saxon.
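To illustrate why an event-driven parser keeps memory flat, here is a minimal Python sketch that uses the standard library's binding to the same expat parser. It is not comma's code; the function name `count_top_level` is made up for this example, and it only counts the direct children of the root by tag name, which is a cheap way to see what a huge file contains before deciding how to split it.

```python
import xml.parsers.expat


def count_top_level(source_path):
    """Return {tag: count} for the direct children of the root element."""
    counts = {}
    depth = 0

    def start(name, attrs):
        nonlocal depth
        depth += 1
        if depth == 2:  # a direct child of the root
            counts[name] = counts.get(name, 0) + 1

    def end(name):
        nonlocal depth
        depth -= 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    with open(source_path, "rb") as fh:
        parser.ParseFile(fh)  # streams the file; no tree is ever built
    return counts
```

Only the depth counter and the small dictionary of counts are held in memory, regardless of the input size.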

xmlstarlet and saxon failed for us too so that's why I added the xml tools to comma. – mat_geek Nov 02 '14 at 23:44
Perhaps this question is still relevant, and I believe this can help somebody. There is an XML editor, XiMpLe, which contains a tool for splitting big files. Only the fragment size is required. There is also the reverse functionality to join XML files back together. It's free for non-commercial use, and the license is not expensive either. No installation is required. It worked very well for me (I had a 5 GB file).

Nice, this solution only worked for me OTB with minimum effort. Thx. – majkinetor Sep 15 '21 at 08:52
In what way do you need to split it? It's pretty easy to write code using XmlReader.ReadSubtree. It returns a new XmlReader instance over the current element and all its child elements. So, move to the first child of the root, call ReadSubtree, write all those nodes, call Read() using the original reader, and loop until done.
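The same loop (visit each child of the root, write it out, move on) can be sketched in Python with `xml.etree.ElementTree.iterparse`, which is a different API from XmlReader but streams in a similar way. This is only a sketch under some assumptions: the function name `split_xml`, the `piece<N>.xml` naming, and the `per_file` parameter are invented here, the document is assumed to use no namespaces, and any attributes on the root element are dropped from the pieces.

```python
import os
import xml.etree.ElementTree as ET


def split_xml(source_path, out_dir, per_file=1_000_000):
    """Write the root's children into numbered piece files.

    Returns the number of piece files written.
    """
    os.makedirs(out_dir, exist_ok=True)
    file_count = 0
    in_piece = 0
    out = None
    root = None
    depth = 0

    for event, elem in ET.iterparse(source_path, events=("start", "end")):
        if event == "start":
            depth += 1
            if depth == 1:
                root = elem  # remember the root so its tag can be reused
            continue

        depth -= 1
        if depth != 1:
            continue  # only completed direct children of the root are written

        if out is None:
            # Start a new piece wrapped in the same root tag as the source
            file_count += 1
            path = os.path.join(out_dir, f"piece{file_count}.xml")
            out = open(path, "w", encoding="utf-8")
            out.write(f"<{root.tag}>\n")

        elem.tail = None  # do not copy whitespace that follows the element
        out.write(ET.tostring(elem, encoding="unicode") + "\n")
        in_piece += 1
        root.clear()  # drop the finished child so memory use stays flat

        if in_piece == per_file:
            out.write(f"</{root.tag}>\n")
            out.close()
            out, in_piece = None, 0

    if out is not None:  # close a partially filled last piece
        out.write(f"</{root.tag}>\n")
        out.close()
    return file_count
```

For example, `split_xml("15GB.xml", "pieces", per_file=1_000_000)` would write `pieces/piece1.xml`, `pieces/piece2.xml`, and so on, each holding up to one million top-level elements.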

I used this to split the Yahoo Q&A dataset:
count = 0
file_count = 1
with open('filepath') as f:
    current_file = ""
    for line in f:
        current_file = current_file + line
        # Count each closing tag of the element being split on
        if "</your tag to split>" in line:
            count = count + 1
        if count == 50000:
            # Close the root element and write out this chunk
            current_file = current_file + "</endTag>"
            with open('filepath/Split/file_' + str(file_count) + '.xml', 'w') as split:
                split.write(current_file)
            file_count = file_count + 1
            # Start the next chunk with a fresh XML declaration and root element
            current_file = "<?xml version='1.0' encoding='UTF-8'?>\n<endTag>"
            count = 0
    # Write whatever is left as the final chunk
    current_file = current_file + "</endTag>"
    with open('filepath/Split/file_' + str(file_count) + '.xml', 'w') as split:
        split.write(current_file)

I used the XmlSplit Wizard tool. It works really nicely, and you can specify the split method, such as by element, rows, number of files, or file size. The only problem is that I had to buy it for $99, as the trial version won't let you split all the data, only the odd-numbered output files. I was able to split a 70 GB file!

Not an XML tool, but UltraEdit could probably help. I've used it with 2 GB files and it didn't mind at all; make sure you turn off the auto-backup feature, though.

I've added a solution onto the linked item http://stackoverflow.com/questions/4325823/how-do-i-split-a-large-xml-file/ – Steve Black Jun 11 '15 at 02:49
Here are the instructions on how to use UE to split a large file: http://www.ultraedit.com/support/tutorials_power_tips/ultraedit/split-large-files.html – MrTelly Nov 13 '16 at 01:03