
I have an XML file in the format below:

<installer>
<Plugins>
.
.
.
</Plugins>
</installer>
<installer>
<Plugins>
.
.
.
</Plugins>
</installer> 

As you can see, there are two installer blocks here. I want to split them, writing the first installer block to Test1.xml and the second installer block to Test2.xml.

I know how to do this with a for loop, but I would appreciate a solution using sed or awk for faster processing.

Shankar Guru
  • You may be interested in http://search.cpan.org/dist/XML-Twig/tools/xml_split/xml_split – Wintermute May 11 '15 at 19:45
  • This may also be interesting to you as far as a starting point goes: http://stackoverflow.com/questions/17988756/how-to-select-lines-between-two-marker-patterns-which-may-occur-multiple-times-w - you'll have to add some variables for which file to print to but shouldn't be too much modification. – zzevannn May 11 '15 at 19:54
  • Is that really what you have, or do you have an XML file like you claim? – ikegami May 11 '15 at 20:16
  • Following on from previous question - http://stackoverflow.com/questions/30171183/read-xml-file-in-perl - it's probably a real XML file. – Sobrique May 11 '15 at 20:50
  • The data you show isn't a well-formed XML document because it has multiple `<installer>` elements at the root level. A valid XML document may have only one root element. Is this really what your data looks like? – Borodin May 12 '15 at 00:11

2 Answers


Please please don't use a regular expression or line based approach to splitting XML. That way lies brittle code and broken XML, and that's just bad news for all concerned.

Using the XML you posted in your previous question as a reference point:

Read XML file in perl

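#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

# Counter used to number the output files: Test1.xml, Test2.xml, ...
my $file_extn = 1;

# Handler called once for each complete <installer> element.
sub split_installer {
    my ( $twig, $installer ) = @_;
    my $filename = "Test" . $file_extn++ . ".xml";
    open( my $output, ">", $filename ) or die "Cannot open $filename: $!";
    # sprint() returns the element, including its children, as XML text.
    print {$output} $installer->sprint();
    close($output);
}

my $twig = XML::Twig->new( twig_handlers => { 'installer' => \&split_installer } )
    ->parsefile('your_file.xml');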

Much of this is already handled by the xml_split utility, which is part of the XML::Twig distribution.

Sobrique

Unix-like systems come with split. I'd recommend trying that.

% cat test.xml
<installer>
<Plugins>
.
.
.
</Plugins>
</installer>
<installer>
<Plugins>
.
.
.
</Plugins>
</installer>

% split -l 7 test.xml test

% cat testaa
<installer>
<Plugins>
.
.
.
</Plugins>
</installer>
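
Note that split -l 7 only works when every installer block is exactly 7 lines long. If the blocks vary in length, a line-based awk script can start a new file at each opening tag instead. The following is only a rough sketch, not a real XML parser: it assumes every `<installer>` opening tag begins its own line, as in the sample, and the input file name test.xml is just a placeholder. For anything less regular, a proper XML parser is the safer route.

```shell
# Build a small sample input (test.xml is a placeholder file name).
cat > test.xml <<'EOF'
<installer>
<Plugins>
.
</Plugins>
</installer>
<installer>
<Plugins>
.
.
</Plugins>
</installer>
EOF

# Each time a line begins with <installer>, close the previous output
# file and switch to the next one (Test1.xml, Test2.xml, ...).
# Every line seen after the first <installer> goes to the current file.
awk '/^<installer>/ { if (out) close(out); out = sprintf("Test%d.xml", ++n) }
     out { print > out }' test.xml
```

This reads the input once and names the output files Test1.xml and Test2.xml, as the question asks.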
Registered User