Please don't use regular expressions to manipulate XML. XML is a contextual language and regular expressions aren't, so a regex can NEVER handle it properly. At best you have a dirty hack that will one day break for no discernible reason, because it makes assumptions that aren't valid.
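To make that concrete, here is a rough sketch of how a regex falls over. The `naive_extract` helper and the sample strings are made up for illustration. Every variant carries the same text content as far as an XML parser is concerned, yet the regex only copes with the first one.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: a naive regex "parser" that pulls the text out of <String>.
sub naive_extract {
    my ($xml) = @_;
    return $xml =~ m{<String>(.*?)</String>} ? $1 : undef;
}

# Different, equally valid ways of writing an element whose text is "Hello World".
my @variants = (
    '<String>Hello World</String>',
    '<String >Hello World</String>',               # whitespace before the >
    '<String lang="en">Hello World</String>',      # an attribute was added
    "<String\n>Hello World</String\n>",            # line break inside the tags
    '<String><![CDATA[Hello World]]></String>',    # CDATA section
);

for my $xml (@variants) {
    my $got = naive_extract($xml);
    my $status = ( defined $got && $got eq 'Hello World' ) ? 'ok' : 'BROKEN';
    ( my $shown = $xml ) =~ s/\n/\\n/g;            # keep the report on one line
    print "$status\t$shown\n";
}
```

You can keep patching the pattern for each case, but that is exactly the brittle hack described above.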
Please use a parser. It's not hard, and it means you avoid creating brittle code.
Longhand in Perl, it's:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $input = "String= Hello World";
# Limit the split to 2 fields so an '=' inside the content is preserved.
my ( $tag, $content ) = split /=/, $input, 2;
XML::Twig::Elt->new( $tag, $content )->print;
This outputs:
<String> Hello World</String>
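A side benefit worth showing: the element takes care of escaping. The following is a minimal sketch with a made-up input string, assuming XML::Twig's default output settings. It compares the serialised element with naive string concatenation.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

# Made-up input whose content contains characters that are special in XML.
my $input = "String=Fish & Chips <cheap>";
my ( $tag, $content ) = split /=/, $input, 2;

# Built through the library: & and < should come out escaped.
my $elt = XML::Twig::Elt->new( $tag, $content );
print $elt->sprint, "\n";

# Built by gluing strings together: the raw & and < make this ill-formed XML.
my $naive = "<$tag>$content</$tag>";
print $naive, "\n";
```

The second line printed is not something an XML parser will accept, which is the sort of bug that only appears once real data turns up.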
As a more extensive example:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
# Build an empty document with a <root> element to append to.
my $doc = XML::Twig->new( pretty_print => 'indented_a' ) ;
$doc->set_xml_version("1.0");
$doc->set_encoding('utf-8');
$doc->set_root( XML::Twig::Elt->new('root') );
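# Read "tag=content" lines from STDIN or from files named on the command line.
while (<>) {
    chomp;
    my ( $tag, $content ) = split /=/, $_, 2;
    next unless defined $content;    # skip lines with no '='
    # Only keep entries whose content starts with 'B'.
    if ( $content =~ m/^B/ ) {
        $doc->root->insert_new_elt( 'last_child', $tag, $content );
    }
}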
$doc->print;
Input of:
String= Hello World
tag=B1234 some text here
newtag=fish heads fish heads roly poly fish heads
String=Better fun joy here
Gives a result of:
<?xml version="1.0" encoding="utf-8"?>
<root>
  <tag>B1234 some text here</tag>
  <String>Better fun joy here</String>
</root>
It's not too hard to use a proper parser, and if you need more reason to do so, see: RegEx match open tags except XHTML self-contained tags