0

I would like to extract some strings from an XML file. My XML file is below:

<PartNumber name="750">
    <SubComponent name="FPGA">
        <SubComponentItem name="0" device_name="golden" desc="GPCAM FPGA Golden Image" rev="0x002a0023" type="FPGA_T6E_PIC" cache="yes" />
        <SubComponentItem name="1" device_name="user"   desc="GPCAM FPGA User Image"   rev="0x002a0023" type="FPGA_T6E_PIC" cache="yes" />
    </SubComponent>
    <SubComponent name="LTC">
        <SubComponentItem name="0" desc="ltc3880-1.0v-0" rev="0x0003" type="PMBUS_T6E_QSFP28" device_name="ltc3880-1.0v" index="0xb4" />
        <SubComponentItem name="1" desc="ltc3880-3.3v"   rev="0x0003" type="PMBUS_T6E_QSFP28" device_name="ltc3880-3.3v" index="0xb4" />
    </SubComponent>
    <SubComponent name="EEPROM">
        <SubComponentItem name="0"  desc="BCM8238X Retimer 0 ver"       device_name="SLOT_NUMBER/%SLOT_NUMBER/0"  rev="D00E"      type="BCM8238X_EEPROM" cache="yes" />
        <SubComponentItem name="1"  desc="BCM8238X Retimer 0 checksum"  device_name="SLOT_NUMBER/%SLOT_NUMBER/0"  checksum="600D" type="BCM8238X_EEPROM" cache="yes" />
        <SubComponentItem name="2"  desc="BCM8238X Retimer 1 ver"       device_name="SLOT_NUMBER/%SLOT_NUMBER/0"  rev="D00E"      type="BCM8238X_EEPROM" cache="yes" />
        <SubComponentItem name="3"  desc="BCM8238X Retimer 1 checksum"  device_name="SLOT_NUMBER/%SLOT_NUMBER/0"  checksum="600D" type="BCM8238X_EEPROM" cache="yes" />
    </SubComponent>
</PartNumber>

For example, I want to extract the rev value for PartNumber name="750", inside the SubComponent named "FPGA". How can I extract it and store it?

I tried the code below but still encountered errors:

  use strict;
  use warnings;
  use XML::Simple;
  use Data::Dumper;

  my $simple = XML::Simple->new();
  my $data = $simple->XMLin('/cy/programable/1ProgrammableRevision.xml');

  print Dumper($data) . "\n";

  print $data->{PartNumber}->{750}->{FPGA}->{0}->{rev}->[1];

For your information, my Perl version is 5.8.8, and XML::LibXML and XML::Twig are not available to me.

Sobrique
Will Lee
  • Why would you install one library (`XML::Simple` isn't core) but not others, when they're demonstrably better? – Sobrique Sep 20 '17 at 09:02
  • XML::Simple is nasty; rather use Twig. – Gerhard Sep 20 '17 at 09:06
  • I like `XML::Twig` for being easy to get started with. I like `XML::LibXML` for being fully featured and powerful. – Sobrique Sep 20 '17 at 09:08
  • I'm not sure why. I just joined the company and they have not given me permission to install other libraries. – Will Lee Sep 20 '17 at 09:08
  • I don't understand why anyone would use XML::Simple after reading [the warnings in the documentation](https://metacpan.org/pod/XML::Simple#STATUS-OF-THIS-MODULE). – Dave Cross Sep 20 '17 at 09:09
  • You can install modules locally. But if you can't, read through ["Why is XML::Simple discouraged"](https://stackoverflow.com/questions/33267765/why-is-xmlsimple-discouraged) and get the alternatives installed. – Sobrique Sep 20 '17 at 09:09
  • Also: `perl 5.8.8` is _old_. 9 years old to be specific. If isolation is really your problem, then things like Docker can build a `perl` container quite handily. – Sobrique Sep 20 '17 at 09:11
  • Two important questions to ask when being interviewed for a Perl job: "Which version of Perl do you use?" and "How do developers get new CPAN modules included in the project?" If the answers aren't satisfactory, just don't take the job. – Dave Cross Sep 20 '17 at 09:12

2 Answers

6

Don't use XML::Simple - this task is much easier using xpath, and for that you need XML::LibXML or XML::Twig.

#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

# Create a twig object, then parse the file; parsefile returns the twig.
my $twig = XML::Twig->new->parsefile('/cy/programable/1ProgrammableRevision.xml');

my $value = $twig -> get_xpath('//PartNumber[@name="750"]/SubComponent[@name="FPGA"]/SubComponentItem[@device_name="user"]',0 ) -> att('rev');

print $value;

One of the niceties of XPath is that you can use partial paths. Let's say you know you're looking for a "FPGA_T6E_PIC":

my $value = $twig -> get_xpath('//SubComponentItem[@type="FPGA_T6E_PIC"]',0 ) -> att('rev');
print $value;
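
To store the values rather than print a single one, the same XPath approach can be looped over every matching item. This is a minimal sketch, assuming XML::Twig is installed. The inlined `$xml` string and the `%rev_for` hash are illustrative names that are not in the original, and the sample is trimmed to the FPGA block from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::Twig;

# Trimmed copy of the question's XML, inlined so the sketch is self-contained.
my $xml = <<'END_XML';
<PartNumber name="750">
    <SubComponent name="FPGA">
        <SubComponentItem name="0" device_name="golden" rev="0x002a0023" type="FPGA_T6E_PIC" />
        <SubComponentItem name="1" device_name="user"   rev="0x002a0023" type="FPGA_T6E_PIC" />
    </SubComponent>
</PartNumber>
END_XML

my $twig = XML::Twig->new;
$twig->parse($xml);

# %rev_for is a hypothetical name: device_name => rev for each FPGA item.
my %rev_for;
for my $item ( $twig->get_xpath('//PartNumber[@name="750"]/SubComponent[@name="FPGA"]/SubComponentItem') ) {
    $rev_for{ $item->att('device_name') } = $item->att('rev');
}

# Print whatever was collected, one pair per line.
print "$_ => $rev_for{$_}\n" for sort keys %rev_for;
```

Called in list context without an offset, `get_xpath` returns every matching element, so the hash ends up with one entry per `SubComponentItem`.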
Sobrique
-3

With a simple regular expression?

my $name='';

if (/<PartNumber ([^>]+)>/) {
    my $PN_attr=$1;
    if ($PN_attr =~ /name="([^"]*)"/) {
        $name=$1;
    }
}

print $name;

The [^>]+ pattern matches everything up to the closing delimiter, so the capture cannot run past the end of the < … > block.
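
Extending the same idea to the `rev` attribute from the question might look like the sketch below. It assumes each tag sits on a single line with double-quoted attributes, as in the sample. Real XML does not guarantee that, so treat this as a stopgap for when no parser module can be installed. The `attr_of` helper, the `%rev_for` hash and the inlined `@lines` sample are illustrative names, not from the original.

```perl
use strict;
use warnings;

# Trimmed copy of the question's XML, one tag per line (an assumption).
my @lines = split /\n/, <<'END_XML';
<PartNumber name="750">
    <SubComponent name="FPGA">
        <SubComponentItem name="0" device_name="golden" rev="0x002a0023" type="FPGA_T6E_PIC" />
        <SubComponentItem name="1" device_name="user"   rev="0x002a0023" type="FPGA_T6E_PIC" />
    </SubComponent>
    <SubComponent name="LTC">
        <SubComponentItem name="0" rev="0x0003" device_name="ltc3880-1.0v" />
    </SubComponent>
</PartNumber>
END_XML

# Hypothetical helper: return one attribute value from a tag's attribute
# text, or an empty string if the attribute is absent. The \b keeps
# "name" from matching inside "device_name".
sub attr_of {
    my ( $attrs, $key ) = @_;
    return $attrs =~ /\b\Q$key\E="([^"]*)"/ ? $1 : '';
}

my ( $part, $component ) = ( '', '' );
my %rev_for;    # item name => rev, for PartNumber 750 / SubComponent FPGA

for my $line (@lines) {
    if ( $line =~ /<PartNumber\s([^>]+)>/ ) {
        $part = attr_of( $1, 'name' );         # remember the enclosing part
    }
    elsif ( $line =~ /<SubComponent\s([^>]+)>/ ) {
        $component = attr_of( $1, 'name' );    # remember the enclosing component
    }
    elsif ( $line =~ /<SubComponentItem\s([^>]+)>/ ) {
        my $attrs = $1;
        next unless $part eq '750' && $component eq 'FPGA';
        my $rev = attr_of( $attrs, 'rev' );
        $rev_for{ attr_of( $attrs, 'name' ) } = $rev if $rev ne '';
    }
}

print "$_ => $rev_for{$_}\n" for sort keys %rev_for;
```

The script only tracks which `PartNumber` and `SubComponent` it last saw, so it breaks as soon as a tag spans several lines or the nesting changes, which is why a real parser is preferable.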

  • https://stackoverflow.com/questions/6751105/why-its-not-possible-to-use-regex-to-parse-html-xml-a-formal-explanation-in-la – Dave Cross Sep 20 '17 at 09:26