0

Perl beginner here. I have a script that makes an API call, collects the response in XML format, and then uses XML::Simple to massage the data into the data structure below. I'm aiming for the following output:

filename1.req, UserFaulted,123

filename2.req, UserFaulted,321

Data Structure:

$VAR1 = {
      'xmlns:i' => 'http://www.w3.org/2001/XMLSchema-instance',
      'xmlns' => 'http://example.com',
      'UserRequest' => {
                       'i1' => {
                               'Id' => 'e012',
                               'Dependencies' => [
                                                 {}
                                               ],
                               'xmlns:z' => 'http://schemas.microsoft.com/2003/10/Serialization/',
                               'IdentityUserNumber' => '123',
                               'Stage' => 'UserFaulted',
                               'StartTimestamp' => '2016-04-29T00:05:11',
                               'HomeFileName' => 'filename1.req',
                               'UseBypass' => 'false'
                             },
                       'i2' => {
                               'Id' => 'e013',
                               'Dependencies' => [
                                                 {}
                                               ],
                               'xmlns:z' => 'http://schemas.microsoft.com/2003/10/Serialization/',
                               'IdentityUserNumber' => '321',
                               'Stage' => 'UserFaulted',
                               'StartTimestamp' => '2016-04-19T19:50:18',
                               'HomeFileName' => 'filename2.req',
                               'UseBypass' => 'false'
                             }

                     }
    };

Here is what I have so far. At this point I'm starting to think I've shot myself in the foot, but any feedback or suggestions would be greatly appreciated.

#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple qw(:strict);
use Data::Dumper;



my $time = "2016-04-19";

my $api_faultedreqs = `curl -x 111.222.333.444:8080 -U user:pass -H "Accept: application/xml" -H "Content-Type: application/xml" "https://example.com" 2>/dev/null`;



my $xml_fault_reqs = XMLin($api_faultedreqs, KeyAttr => { UserRequest => 'Id' }, ForceArray => [ 'UserRequest', 'Dependencies' ]);
my %xml_fault_reqs = %$xml_fault_reqs;
my %clean_out = ();


print Dumper($xml_fault_reqs);

#print $xml_fault_reqs->{UserRequest}->{i1}->{HomeFileName};

for my $outer_key (keys %xml_fault_reqs){
        next if $outer_key =~/xmlns/;
                for my $req_ids2 (keys %{ $xml_fault_reqs{$outer_key} }){
                        for my $req_data (keys %{ $xml_fault_reqs{$outer_key}{$req_ids2} }){
                                next if $req_data =~/xmlns/ or $req_data =~/Dependencies/ or $req_data =~/UseBypass/ or $req_data =~/EndTimestamp/;
                                #print "$req_data, $xml_fault_reqs{$outer_key}{$req_ids2}{$req_data}\n";
                                print "$xml_fault_reqs{$outer_key}{$req_ids2}{HomeFileName}, $xml_fault_reqs{$outer_key}{$req_ids2}{Stage}\n";
                        }
                }
}

XML output as requested:

<ArrayOfUserRequest xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://example.com">
<UserRequest xmlns:z="http://schemas.microsoft.com/2003/10/Serialization/" z:Id="i1">
    <Dependencies/>
    <HomeFileName>filename1.req</HomeFileName>
    <IdentityUserNumber>123</IdentityUserNumber>
    <Stage>UserFaulted</Stage>
    <StartTimestamp>2016-04-29T00:05:11</StartTimestamp>
    <UseBypass>false</UseBypass>
</UserRequest>
<UserRequest xmlns:z="http://schemas.microsoft.com/2003/10/Serialization/" z:Id="i2">
    <Dependencies/>
    <HomeFileName>filename2.req</HomeFileName>
    <IdentityUserNumber>321</IdentityUserNumber>
    <Stage>UserFaulted</Stage>
    <StartTimestamp>2016-04-20T15:44:51</StartTimestamp>
    <UseBypass>false</UseBypass>
</UserRequest>
</ArrayOfUserRequest>

  • Welcome to StackOverflow and the Perl tag. Please [edit] your question and show the actual input XML. Since XML::Simple is discouraged (and says so itself), you will likely receive concise answers that do what you want in a lot less code, but with a different XML module. For that, we need the source XML. – simbabque Apr 29 '16 at 15:10
  • Thanks simbabque, yes i saw that warning in the documentation too, live and learn i guess :) – MkDwonderer Apr 29 '16 at 15:40

3 Answers

1

As simbabque intimated in his comment, XML::Simple is generally frowned upon for a number of reasons. You may want to read the Stack Overflow question Why is XML::Simple "Discouraged"? to better understand why that is.

However, your immediate problem is how to navigate a fairly ordinary Perl nested data structure, and you will find a useful tutorial on that in perldoc perlreftut.
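As a minimal sketch of the kind of navigation that tutorial covers, here is a cut-down copy of the structure from the question. The helper name `field_of` is purely illustrative and not part of any module.

```perl
use strict;
use warnings;

# Cut-down copy of the structure shown in the question (illustrative only)
my $data = {
    UserRequest => {
        i1 => { HomeFileName => 'filename1.req', Stage => 'UserFaulted', IdentityUserNumber => '123' },
        i2 => { HomeFileName => 'filename2.req', Stage => 'UserFaulted', IdentityUserNumber => '321' },
    },
};

# Hypothetical helper: look up one field of one request by its hash key
sub field_of {
    my ($data, $id, $field) = @_;

    # The arrow is required only after the first reference;
    # it is optional between adjacent subscripts
    return $data->{UserRequest}{$id}{$field};
}

print field_of($data, 'i1', 'HomeFileName'), "\n";

# Dereference the inner hash with %{ ... } to list its keys
my @ids = sort keys %{ $data->{UserRequest} };
print "@ids\n";
```

The same two idioms, chained subscripts and `%{ ... }` around a hash reference, are all that is needed to walk the full structure.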

Here's a simple solution to your problem. The items of interest are the values of the second-level hash stored under the UserRequest key, so this program iterates over those values and prints the required fields from each of them.

The printf uses a hash slice to access all three fields at once, with keys HomeFileName, Stage, and IdentityUserNumber. The printf format displays all three on a single line in the format that you have asked for.

use strict;
use warnings 'all';

use XML::Simple;

# my $data = XMLin(...);

my $data = {
    'xmlns:i'     => 'http://www.w3.org/2001/XMLSchema-instance',
    'xmlns'       => 'http://example.com',
    'UserRequest' => {
        'i1' => {
            'Id'                 => 'e012',
            'Dependencies'       => [ {} ],
            'xmlns:z'            => 'http://schemas.microsoft.com/2003/10/Serialization/',
            'IdentityUserNumber' => '123',
            'Stage'              => 'UserFaulted',
            'StartTimestamp'     => '2016-04-29T00:05:11',
            'HomeFileName'       => 'filename1.req',
            'UseBypass'          => 'false'
        },
        'i2' => {
            'Id'                 => 'e013',
            'Dependencies'       => [ {} ],
            'xmlns:z'            => 'http://schemas.microsoft.com/2003/10/Serialization/',
            'IdentityUserNumber' => '321',
            'Stage'              => 'UserFaulted',
            'StartTimestamp'     => '2016-04-19T19:50:18',
            'HomeFileName'       => 'filename2.req',
            'UseBypass'          => 'false'
        }
    }
};

for my $request ( values %{ $data->{UserRequest} } ) {
    printf "%s, %s,%s\n", @{$request}{qw/ HomeFileName  Stage  IdentityUserNumber  /};
}

output

filename1.req, UserFaulted,123
filename2.req, UserFaulted,321
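
One caveat: `values` returns hash entries in no particular order, so the two lines may come out in either order. If a stable order matters, a minimal sketch is to iterate over the sorted keys instead. The helper name `report_lines` is hypothetical, and the data is a cut-down copy of the structure above.

```perl
use strict;
use warnings;

# Hypothetical helper: one formatted line per request, ordered by hash key.
# This is a plain string sort, so a key such as i10 would sort before i2.
sub report_lines {
    my ($data) = @_;
    my $requests = $data->{UserRequest};

    return map {
        sprintf "%s, %s,%s",
            @{ $requests->{$_} }{qw/ HomeFileName Stage IdentityUserNumber /}
    } sort keys %$requests;
}

# Cut-down copy of the structure used above (illustrative only)
my $data = {
    UserRequest => {
        i2 => { HomeFileName => 'filename2.req', Stage => 'UserFaulted', IdentityUserNumber => '321' },
        i1 => { HomeFileName => 'filename1.req', Stage => 'UserFaulted', IdentityUserNumber => '123' },
    },
};

print "$_\n" for report_lines($data);
```

With this data the `i1` line is always printed before the `i2` line, regardless of the hash's internal ordering.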
Borodin
  • @simbabque: It's there because the OP has explained that `$data` is actually the result of an `XMLin` call in the real code, rather than a data literal which isn't of much practical use. So it *will* be necessary in the OP's final solution – Borodin Apr 29 '16 at 16:01
  • With your commented explanation it of course makes sense. – simbabque Apr 29 '16 at 16:03
1

Thank you for showing the XML data. It helps a lot.

Here's a solution that uses the XML::LibXML module to parse the data. It's a little more complicated than it could be because your data uses the default namespace with xmlns="http://example.com". That namespace must be defined and used explicitly in an XPath expression, which means you also need to create an XPath context object using the XML::LibXML::XPathContext module. That allows you to register namespaces and use them in your XPath expressions. Even the default namespace must have a name, so I've called it nul and prefixed every node name with `nul:`.

The code is quite simple. It uses findnodes to locate all UserRequest nodes, and pulls the values of the HomeFileName, Stage, and IdentityUserNumber children from each of them, printing the results with a printf call.

use strict;
use warnings 'all';
use feature 'say';

use XML::LibXML;

my $dom = XML::LibXML->load_xml(IO => \*DATA);
my $xpc = XML::LibXML::XPathContext->new($dom);
$xpc->registerNs(nul => 'http://example.com');

for my $request ( $xpc->findnodes('/nul:ArrayOfUserRequest/nul:UserRequest') ) {

    printf "%s, %s,%s\n",
            $xpc->findvalue('nul:HomeFileName', $request),
            $xpc->findvalue('nul:Stage', $request),
            $xpc->findvalue('nul:IdentityUserNumber', $request);
}

__DATA__
<ArrayOfUserRequest xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://example.com">
<UserRequest xmlns:z="http://schemas.microsoft.com/2003/10/Serialization/" z:Id="i1">
    <Dependencies/>
    <HomeFileName>filename1.req</HomeFileName>
    <IdentityUserNumber>123</IdentityUserNumber>
    <Stage>UserFaulted</Stage>
    <StartTimestamp>2016-04-29T00:05:11</StartTimestamp>
    <UseBypass>false</UseBypass>
</UserRequest>
<UserRequest xmlns:z="http://schemas.microsoft.com/2003/10/Serialization/" z:Id="i2">
    <Dependencies/>
    <HomeFileName>filename2.req</HomeFileName>
    <IdentityUserNumber>321</IdentityUserNumber>
    <Stage>UserFaulted</Stage>
    <StartTimestamp>2016-04-20T15:44:51</StartTimestamp>
<UseBypass>false</UseBypass>
</UserRequest>
</ArrayOfUserRequest>

output

filename1.req, UserFaulted,123
filename2.req, UserFaulted,321
Borodin
0

It looks like you're just trying to print the HomeFileName, Stage, and IdentityUserNumber elements of each UserRequest.

That being so, using something like XML::Twig lets you:

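use XML::Twig;

# Child elements to report, in output order
my @things = qw( HomeFileName Stage IdentityUserNumber );

# $api_faultedreqs holds the XML text returned by the curl call in the question
my $twig = XML::Twig->parse($api_faultedreqs);

foreach my $user_request ( $twig->get_xpath('//UserRequest') ) {
    # Join only the field values; keeping "\n" outside join avoids a trailing comma
    print join( ",", map { $user_request->first_child_text($_) } @things ), "\n";
}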
Sobrique
  • Thank you for letting me know about this gem (XML::Twig). I just used it in another project and it saved so much time! Thanks again :) – MkDwonderer Aug 11 '16 at 20:56