How do you parse this particular XML using Perl?

A little background: I'm writing a Perl script that splits an XML file (datamod.xml) into two XML files.

Example: Existing XML

<Root>
 <Top>
  <Module name="ALU">
   <input name="po_ctrl"/>
   <bidirection name="add"/>
  </Module>
  <Module name="Po_ctrl">
   <input name="ctrl"/>
   <output name="ctrlbus"/>
   <bidirection name="add"/>
  </Module>
  <input name="add"/>
  <input name="clk"/>
  <input name="da_in"/>
  <output name="da_out"/>
  <bidirection name="ctrl"/>
 </Top>
</Root>

Below is the Perl snippet I have written:

 open(IN_FILE, "<datamod.xml") or die "Can't open input file";
 open(TM1_FILE, ">tm1.xml") or die "Can't open tm1.xml";
 open(TM2_FILE, ">tm2.xml") or die "Can't open tm2.xml";
 my $chk = 0;    # 1 while inside a <Module> block
 while (my $line = <IN_FILE>) {
     $line =~ s/^\s+//;
     my @xwords = split(" ", $line);
     # Lines outside any <Module> go to tm1.xml, module lines to tm2.xml
     if ($xwords[0] ne "<Module" and $xwords[0] ne "</Module>" and $chk == 0) {
         print TM1_FILE $line;
     }
     else {
         print TM2_FILE $line;
         $chk = 1;
     }
     if ($xwords[0] eq "</Module>" and $chk == 1) {
         $chk = 0;
     }
 }
 close IN_FILE;
 close TM1_FILE;
 close TM2_FILE;

Expected output in two temp files:

Temp file 1:

<Root>
 <Top>
  <input name="add"/>
  <input name="clk"/>
  <input name="da_in"/>
  <output name="da_out"/>
  <bidirection name="ctrl"/>
 </Top>
</Root>

Temp file 2:

<Root>
 <Top>
  <Module name="ALU">
   <input name="po_ctrl"/>
   <bidirection name="add"/>
  </Module>
  <Module name="Po_ctrl">
   <input name="ctrl"/>
   <output name="ctrlbus"/>
   <bidirection name="add"/>
  </Module>
 </Top>
</Root>

NOTE: I'm using the XML::Simple module because the rest of the Perl script is written with it, and converting to another XML module would be tedious.

Any help is appreciated, kindly post the rewritten snippet!

Divox

2 Answers

Since you have not included any code, or shown what your data currently looks like, I'm going to suggest this simple hack: add the missing root element as text before you parse the XML.

use strict;
use warnings;

my $xml = '...';    # your XML text goes here
$xml = "<Root>\n" . $xml . "</Root>\n";
bolav

Don't use regular expressions for XML. XML is a recursive data structure, and whilst you can technically do recursion with regex, it leads to dirty code. So practically you end up with some very selective hackery that'll one day mysteriously break, because a perfectly valid XML change doesn't fit your regex any more.
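To make the fragility concrete, here is a minimal sketch using only core Perl. The helper name starts_module is made up for illustration; it applies the same first-token test as the script in the question to two serializations of the same element, and it misses the element once the document is written on a single line.

```perl
use strict;
use warnings;

# Hypothetical helper: the same per-line test the question's script uses.
# Strip leading whitespace, split on whitespace, compare the first token.
sub starts_module {
    my ($line) = @_;
    $line =~ s/^\s+//;
    my @words = split ' ', $line;
    return defined $words[0] && $words[0] eq '<Module';
}

# Two equally valid ways of writing the same <Module> element.
my $indented = qq{  <Module name="ALU">\n};
my $one_line = qq{<Root><Top><Module name="ALU"><input name="po_ctrl"/></Module></Top></Root>\n};

printf "indented: %s\n", starts_module($indented) ? 'detected' : 'missed';
printf "one line: %s\n", starts_module($one_line) ? 'detected' : 'missed';
```

In the single-line form the first token is `<Root><Top><Module`, so the string comparison fails even though the XML is unchanged in meaning.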

Also: Don't use XML::Simple for much the same reason. (Despite you saying you are using it in your question, there is no sign of your doing so in the code you have posted).

With a proper parser what you are trying to do becomes very simple. I like XML::Twig; XML::LibXML is probably better, but has a steeper learning curve. Either is less prone to future misery and shoddy code.

What you're trying to do seems to be to split XML, and put modules in one, and "everything else" in another. This is done in XML::Twig like this:

#!/usr/bin/env perl
use strict;
use warnings;

use XML::Twig;

# Parse the input.
my $twig = XML::Twig->new->parsefile('datamod.xml');

# Create a new 'modules' document.
my $modules = XML::Twig->new;
# Create a root.
$modules->set_root( XML::Twig::Elt->new('Root') );
# Create a "Top" element. (You can compound this if you want.)
my $top = $modules->root->insert_new_elt('Top');
# Set the output format. (Note: indenting can change whitespace in
# mixed-content elements; your XML doesn't seem to have any.)
$modules->set_pretty_print('indented_a');

# Find all the "<Module>" elements.
foreach my $module ( $twig->findnodes('//Module') ) {
    # Cut from the old document.
    $module->cut;
    # Paste into the new one. last_child preserves the original ordering.
    $module->paste( 'last_child', $top );
}

# Print what is left of the original document to the first file.
open( my $output, '>', 'tm1.xml' ) or die "Cannot open tm1.xml: $!";
print {$output} $twig->sprint;
close($output);

# Print the extracted modules to the second file.
open( my $second_output, '>', 'tm2.xml' ) or die "Cannot open tm2.xml: $!";
print {$second_output} $modules->sprint;
close($second_output);
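
For comparison, here is a rough sketch of the same split with XML::LibXML, the alternative parser mentioned above. The function name split_modules and the in-memory sample are made up for illustration, and the sketch assumes Module elements are not nested inside one another.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: move every <Module> out of the input into a new
# Root/Top document. Returns (remaining XML, modules XML) as strings.
sub split_modules {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml( string => $xml );

    # Build the skeleton of the second document.
    my $modules = XML::LibXML::Document->new( '1.0', 'UTF-8' );
    my $root    = $modules->createElement('Root');
    $modules->setDocumentElement($root);
    my $top = $modules->createElement('Top');
    $root->appendChild($top);

    foreach my $module ( $doc->findnodes('//Module') ) {
        $modules->adoptNode($module);    # detaches the node from $doc
        $top->appendChild($module);      # appending keeps document order
    }
    return ( $doc->toString, $modules->toString(1) );
}

# Small in-memory sample standing in for datamod.xml.
my $sample = '<Root><Top><Module name="ALU"><input name="po_ctrl"/></Module>'
           . '<input name="clk"/></Top></Root>';
my ( $rest, $mods ) = split_modules($sample);
print $rest, "\n", $mods;
```

To work on the real file, load it with XML::LibXML->load_xml( location => 'datamod.xml' ) instead of a string, and write the two returned strings to tm1.xml and tm2.xml.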

Note - there's a bit more on assembling a new XML document here: Assembling XML in Perl

You may want to consider setting encoding and versions.
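
As a hedged sketch of that last point, XML::Twig provides set_xml_version and set_encoding for the XML declaration. The values '1.0' and 'UTF-8' below are assumptions; use whatever matches your data.

```perl
use strict;
use warnings;
use XML::Twig;

# Build a small document from scratch, as in the answer above.
my $modules = XML::Twig->new( pretty_print => 'indented' );
$modules->set_root( XML::Twig::Elt->new('Root') );
$modules->root->insert_new_elt('Top');

# Give the generated document an XML declaration.
$modules->set_xml_version('1.0');
$modules->set_encoding('UTF-8');

my $out = $modules->sprint;
print $out;
```

Note that set_encoding only changes what the declaration says; it does not transcode the content, so the output file should be written in the encoding you declare.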

Sobrique