I've been scouring the internet for 2 days now trying to find out how to properly reference a multi-level XML file using Perl XML parsers. I'm a Perl novice and this is my first post to this forum, so I have much to learn. I'm starting with XML::Simple, though I realize that some prefer other libraries.

XML Sample file:

<events>
    <event>
        <EventObject>Application</EventObject>
        <EventType>Start</EventType>
        <Operation></Operation>
        <EventTimestamp>Sat 11/21/2015-14:02:57.76</EventTimestamp>
    </event>
    <source>
        <UserIPAddr>192.168.1.2</UserIPAddr>
        <UserHostName>ABC-PROD-BAR-15-01A</UserHostName>
        <UserUUID>EC2-User</UserUUID>
    </source>
    <target>
        <URL>"https://foo.com/"</URL>
    </target>
    <payload>
        <FormData></FormData>
        <PackageFilename></PackageFilename>
    </payload>

    <event>
        <EventObject>User</EventObject>
        <EventType>Download</EventType>
        <Operation>Acknowledge License</Operation>
        <EventTimestamp>Sat 11/21/2015-14:03:10.44</EventTimestamp>
    </event>
    <source>
        <UserIPAddr>10.120.30.4</UserIPAddr>
        <UserHostName>WSM24CN502</UserHostName>
        <UserUUID>simpson homer 750329 </UserUUID>
    </source>
    <target>
        <URL>"https://dev.catalog.com/"</URL>
    </target>
    <payload>
        <FormData></FormData>
        <PackageFilename>"eclipse.luna.5.2.tag.gz"</PackageFilename>
    </payload>
</events>

Sample Code:

#!perl

# use module
use XML::Simple;
use Data::Dumper;
use XML::Parser;

# create object
$xml = new XML::Simple (KeyAttr=>[]);

# read XML file
my $data = $xml->XMLin("auditfile3.xml",forcearray=>1);
#$data = $xml->XMLin("auditfile3.xml",KeyAttr=>{EventRecord=>'Event'});
print Dumper($data);

#print $data->{Events}->{Event};

#my $EventRecord = $data->{EventRecord};
#print Dumper($EventRecord);

#print $EventRecord->{EventObject};
#print $data->{EventObject};

# dereference hash ref
# access <EventRecord> array

foreach my $e (@{$data->{Event}})
    {
     print "EventObject: ",$e->{Event->{EventObject}}, "\n";
     print "EventType:  ", $e->{EventType}, "\n"; 
     print "Operation: ", $e->{Operation}, "\n";
     print "Timestamp: ", $e->{EventTimestamp}, "\n";
    }
bbboomer
  • When people recommend against using XML::Simple it's because they have suffered pain and they know it will cause you pain. [Get off it](http://www.perlmonks.org/index.pl?node_id=490846) sooner rather than later :-) – Grant McLean Nov 23 '15 at 01:52
  • [Why is XML::Simple "discouraged"](http://stackoverflow.com/questions/33267765/why-is-xmlsimple-discouraged) – Sobrique Nov 23 '15 at 17:22

2 Answers


Use XML::LibXML

#!/usr/bin/env perl

use strict;
use warnings;
use feature qw(say);

use XML::LibXML;

my $xml = XML::LibXML->load_xml( IO => \*DATA );

for my $node ( $xml->findnodes('//event') ) {
    for my $property (qw(EventObject EventType Operation EventTimestamp)) {
        next unless my ($child) = $node->findnodes($property);
        say "$property: ", $child->textContent();
    }

    say '';
}

__DATA__
<events>
    <event>
        <EventObject>Application</EventObject>
        <EventType>Start</EventType>
        <Operation></Operation>
        <EventTimestamp>Sat 11/21/2015-14:02:57.76</EventTimestamp>
    </event>
    <source>
        <UserIPAddr>192.168.1.2</UserIPAddr>
        <UserHostName>ABC-PROD-BAR-15-01A</UserHostName>
        <UserUUID>EC2-User</UserUUID>
    </source>
    <target>
        <URL>"https://foo.com/"</URL>
    </target>
    <payload>
        <FormData></FormData>
        <PackageFilename></PackageFilename>
    </payload>

    <event>
        <EventObject>User</EventObject>
        <EventType>Download</EventType>
        <Operation>Acknowledge License</Operation>
        <EventTimestamp>Sat 11/21/2015-14:03:10.44</EventTimestamp>
    </event>
    <source>
        <UserIPAddr>10.120.30.4</UserIPAddr>
        <UserHostName>WSM24CN502</UserHostName>
        <UserUUID>simpson homer 750329 </UserUUID>
    </source>
    <target>
        <URL>"https://dev.catalog.com/"</URL>
    </target>
    <payload>
        <FormData></FormData>
        <PackageFilename>"eclipse.luna.5.2.tag.gz"</PackageFilename>
    </payload>
</events>

Outputs:

EventObject: Application
EventType: Start
Operation:
EventTimestamp: Sat 11/21/2015-14:02:57.76

EventObject: User
EventType: Download
Operation: Acknowledge License
EventTimestamp: Sat 11/21/2015-14:03:10.44
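
The sample file keeps each event's `<source>` and `<target>` as siblings of `<event>` rather than as children. Below is a minimal sketch of how the related fields could be reached with the XPath `following-sibling` axis, assuming each `<event>` is always followed by its own `<source>` and `<target>`. The inline XML is a trimmed copy of the sample, and the variable and key names (`@rows`, `type`, `ip`, `url`) are only illustrative.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use XML::LibXML;

# Trimmed copy of the sample file: <source> and <target> follow each <event>.
my $xml = XML::LibXML->load_xml( string => <<'XML' );
<events>
    <event><EventType>Start</EventType></event>
    <source><UserIPAddr>192.168.1.2</UserIPAddr></source>
    <target><URL>"https://foo.com/"</URL></target>
    <event><EventType>Download</EventType></event>
    <source><UserIPAddr>10.120.30.4</UserIPAddr></source>
    <target><URL>"https://dev.catalog.com/"</URL></target>
</events>
XML

my @rows;
for my $event ( $xml->findnodes('//event') ) {
    push @rows, {
        # Child of <event>.
        type => $event->findvalue('EventType'),
        # Nearest <source>/<target> sibling that comes after this <event>
        # (assumption: the file always keeps them in this order).
        ip   => $event->findvalue('following-sibling::source[1]/UserIPAddr'),
        url  => $event->findvalue('following-sibling::target[1]/URL'),
    };
}

print "$_->{type} from $_->{ip} to $_->{url}\n" for @rows;
```

If the file format is under your control, nesting `<source>`, `<target>` and `<payload>` inside a per-record wrapper element would make this pairing explicit instead of relying on document order.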
Miller
  • Thank you for your response and great example solution. Since the problem had been resolved, I never looked for additional contributions. Much appreciated. – bbboomer Oct 03 '16 at 19:34

XML element names are case sensitive, so the hash key is `event`, not `Event`. Also, you have some syntax errors in the code, such as `$e->{Event->{EventObject}}`.

my $xml = 'XML::Simple'->new(KeyAttr => [], ForceArray => 1);
my $data = $xml->XMLin(...);

for my $e (@{ $data->{event} }) {
    print "EventObject: ", $e->{EventObject}[0], "\n";
    print "EventType: ", $e->{EventType}[0], "\n";
    print "Operation: ", ref $e->{Operation}[0] ? '-empty-'
                                                : $e->{Operation}[0], "\n";
    print "Timestamp: ", $e->{EventTimestamp}[0], "\n";
}
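
To see why the `[0]` index and the `ref` check are needed, it can help to look at the shape of the data. The structure below is hand-written to approximate what `XMLin` returns for the first `<event>` with `KeyAttr => []` and `ForceArray => 1`. It is an illustration, not real `Data::Dumper` output, and the helper `field` is a made-up name.

```perl
use strict;
use warnings;

# Hand-built approximation of the XMLin result for the sample file
# (assumption: ForceArray => 1, KeyAttr => [], default SuppressEmpty).
# Every child element becomes an array ref; an empty element becomes {}.
my $data = {
    event => [
        {
            EventObject    => ['Application'],
            EventType      => ['Start'],
            Operation      => [ {} ],    # <Operation></Operation>
            EventTimestamp => ['Sat 11/21/2015-14:02:57.76'],
        },
    ],
};

# Hypothetical helper: first value of a field, or '-empty-' when the
# element was empty and is therefore a hash ref rather than a string.
sub field {
    my ($record, $name) = @_;
    my $value = $record->{$name}[0];
    return ref $value ? '-empty-' : $value;
}

for my $e (@{ $data->{event} }) {
    print "$_: ", field($e, $_), "\n"
        for qw(EventObject EventType Operation EventTimestamp);
}
```

Without `[0]`, printing `$e->{EventType}` would show something like `ARRAY(0x...)`, because the value is a reference to a one-element array rather than the string itself.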
choroba
  • Thank you Choroba. Solution is most humbling. – bbboomer Nov 23 '15 at 01:43
  • why is it necessary to reference array [0]? And what is the difference between 'for' and 'foreach' in this case? I thought that between the 'foreach' and @ that it picked up each item in the array created by XMLin? – bbboomer Nov 23 '15 at 01:55
  • @bbboomer: `ForceArray` creates array references everywhere, so you have to use `[0]` to get their first element. `for` and `foreach` are the same command, but one can type `for` faster. – choroba Nov 23 '15 at 01:59
  • Again, thank you for a quick response to my questions and resolution. – bbboomer Nov 23 '15 at 02:01