-3

I have an XML file in a string variable ($data), and a hash containing all the tag names and their contents (%field_list).

Each tag name needs to be checked: if it is an enumerated field, its content should be replaced with the corresponding string.

Does anybody know if it is possible to do this with search and replace? I am not having much luck at the moment.

foreach my $field (sort keys %field_list)
{
    my $value = $field_list{$field};
    # will return a non-empty string if field is enumerated and value is valid
    my $enum_string = &convert_enumeration_to_string($field, $value);
    if ($enum_string ne "")
    {
        # syntax error on the next line
        $data =~ s/<($field)>($value)</($field)>/<($field)>($enum_string)</($field)>/g;
    }
}

Does anybody know whether there is anything I can do, or do I need a completely different approach?

brian d foy
Andy
  • Not answering your question but offering some advice instead. You really should not use regexps with xml. Use one of the many xml modules found at cpan. Start with XML::Simple – Nifle Nov 18 '09 at 15:41
  • I just wanted to add weight to this because other people seem to be passively encouraging it by giving you a solution -- parsing xml with regex is retarded: other alternatives you might be interested in would be XML::Twig or XML::LibXML – Evan Carroll Nov 18 '09 at 16:47
  • Please go vote for http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 ;-)) – Sinan Ünür Nov 18 '09 at 16:55
  • If that post gets any more votes it's going to wrap around to -maxint. :) – Ether Nov 18 '09 at 18:21
  • What's with all the negative votes? I suspect we have all tried to parse xml with regexps (and then learnt better) – Nifle Nov 18 '09 at 21:06
  • +1 to offset the (current) -3 – Nifle Nov 18 '09 at 21:07
  • EvanCarroll, thank you for your comment. I don't like this method either. Unfortunately XML::* does not seem to be part of the basic Perl installation, and getting 500+ machines in our workplace upgraded so I can use the package is not, shall we say, a very promising prospect. – Andy Nov 19 '09 at 17:32

3 Answers

3

Escape your slashes:

$data =~ s/<($field)>($value)<\/($field)>/<($field)>($enum_string)<\/($field)>/g;

Or use different delimiters:

$data =~ s{<($field)>($value)</($field)>}{<($field)>($enum_string)</($field)>}g;
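As the comments below point out, parentheses in the replacement part are literal characters and end up in the output. A minimal sketch of the same substitution without them, with `\Q...\E` added so that regex metacharacters in the interpolated values (such as the `.` in `1.5`) are matched literally. The sample XML, the `%enum_for` table, and the body of `convert_enumeration_to_string` are hypothetical stand-ins, since the question does not show them:

```perl
use strict;
use warnings;

# Hypothetical sample data; the question does not show the real XML.
my $data = '<xml><colour>2</colour><size>1.5</size><name>2</name></xml>';
my %field_list = (colour => '2', size => '1.5', name => '2');

# Hypothetical stand-in for the asker's convert_enumeration_to_string():
# returns '' unless the field is enumerated and the value is known.
my %enum_for = (colour => { 1 => 'red', 2 => 'green' });

sub convert_enumeration_to_string {
    my ($field, $value) = @_;
    return '' unless exists $enum_for{$field};
    return $enum_for{$field}{$value} // '';
}

foreach my $field (sort keys %field_list) {
    my $value       = $field_list{$field};
    my $enum_string = convert_enumeration_to_string($field, $value);
    next if $enum_string eq '';

    # s{}{} avoids having to escape "/"; \Q...\E quotes metacharacters
    # in the pattern. The replacement side is a plain interpolated
    # string, so it needs neither parentheses nor \Q.
    $data =~ s{<\Q$field\E>\Q$value\E</\Q$field\E>}{<$field>$enum_string</$field>}g;
}

print "$data\n";
```

This only handles tags written exactly as `<name>value</name>`, with no attributes, whitespace, or nesting, which is one reason the comments recommend an XML parser.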
Jack M.
  • I recommend not using `|` as a delimiter because then you can't use `|`-alternation in your pattern. My delimiter of choice is `s{}{}` which usually doesn't interfere with anything. – friedo Nov 18 '09 at 15:39
  • Thank you very much. So search and replace by variable is possible; the only thing I missed was the backslashes in front of the forward slashes. Also, the brackets are not needed, as they mess up the output. – Andy Nov 18 '09 at 15:48
  • I'm at a loss as to why the brackets would change the output, but you're welcome. – Jack M. Nov 18 '09 at 15:55
  • You can even use `{}` in a `s{}{}`, as long as the `{}` are "nested". `s{{\w}}{}` – Brad Gilbert Nov 18 '09 at 16:06
  • Don't think the OP wants to have parentheses in the replacement, though. – Leonardo Herrera Nov 18 '09 at 17:55
1

Well, let's jump on the XML bandwagon: use an XML library like XML::LibXML to manipulate XML documents.

use XML::LibXML;
my $dom = XML::LibXML->load_xml(string => $data);

foreach my $field (sort keys %field_list) {
    my $value = $field_list{$field};
    if (my $enum_string = &convert_enumeration_to_string($field, $value)) {
        foreach my $node ($dom->findnodes("//xml/${field}[. = '$value']")
            ->get_nodelist) {
            my $element = $dom->createElement($field);
            $element->appendText($enum_string);
            $node->replaceNode($element);
        }
    }
}

print $dom->toString;
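A variation on the code above, offered as a sketch rather than a drop-in replacement: it compares each node's text in Perl instead of interpolating `$value` into the XPath string (which would break if the value contained a quote), and it rewrites the text in place rather than building a new element. It also selects `//$field` anywhere in the document, so it does not assume a root element named `xml`. The sample data and the enumeration table are hypothetical:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample data; the question does not show the real XML.
my $data = '<xml><colour>2</colour><size>1.5</size></xml>';
my %field_list = (colour => '2', size => '1.5');

# Hypothetical stand-in for the asker's convert_enumeration_to_string().
my %enum_for = (colour => { 1 => 'red', 2 => 'green' });

sub convert_enumeration_to_string {
    my ($field, $value) = @_;
    return '' unless exists $enum_for{$field};
    return $enum_for{$field}{$value} // '';
}

my $dom = XML::LibXML->load_xml(string => $data);

foreach my $field (sort keys %field_list) {
    my $value       = $field_list{$field};
    my $enum_string = convert_enumeration_to_string($field, $value);
    next if $enum_string eq '';

    # In list context findnodes() returns the matching nodes directly.
    foreach my $node ($dom->findnodes("//$field")) {
        # Compare in Perl so $value never becomes part of the XPath.
        next unless $node->textContent eq $value;
        $node->removeChildNodes;           # drop the old text
        $node->appendText($enum_string);   # add the enumeration string
    }
}

# Serialize only the root element (no XML declaration).
my $result = $dom->documentElement->toString;
print "$result\n";
```

The field name is still interpolated into the XPath, so this assumes the keys of `%field_list` are valid element names.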
Leonardo Herrera
0

The correct way to do it is:

$data =~ s/<($field)>($value)<\/($field)>/<$field>$enum_string<\/$field>/g; 
Andy
  • Just guessing, but parsing XML with regexes is almost NEVER the correct way to do it. – daotoad Nov 18 '09 at 18:48
  • No need to downvote an answer just because the question was bad. This does answer the question. The other answer got upvoted, and the first part of it is identical to this (except with that parentheses mistake/typo in the replacement). – Cascabel Nov 18 '09 at 19:09