I am writing a Perl script that reformats an XML file, and I need help handling the whitespace that appears before the opening of an XML tag. I have the following XML file:
test.xml
<xml>
<TI>Definitions, Exemptions and Rebates "where"
<VARPARA><VAR>E</VAR></VARPARA></TI>
</xml>
I want a regex that replaces any whitespace (including extra spaces and newline characters) before the opening of any XML tag with a single space. In the example above, <VARPARA> is the tag that is preceded by whitespace and a newline character after "where".
I was thinking something along the lines of
$s =~ s/\s*</ </ig;
but that only looks at the opening <, whereas I want to check both the opening < and the closing >, i.e. match a complete tag such as <VARPARA>.
The output string should look like this:
<xml>
<TI>Definitions, Exemptions and Rebates "where" <VARPARA><VAR>E</VAR></VARPARA></TI>
</xml>
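One possible way to express this is sketched below. It assumes the rule is "collapse whitespace only when it sits between text and a complete opening tag", which is what keeps the newlines after <xml> and before </xml> in the desired output. The helper name collapse_before_open_tag is made up for illustration, and the tag pattern is a simplification that does not handle a > inside an attribute value, comments, or CDATA.

```perl
use strict;
use warnings;

# Hypothetical helper: replace a run of whitespace with one space, but only
# when it follows text and precedes a complete opening tag like <VARPARA>.
sub collapse_before_open_tag {
    my ($s) = @_;
    $s =~ s{
        (?<=[^>\s])            # preceded by text, not by a tag end or more whitespace
        \s+                    # the run of spaces and newlines to collapse
        (?=<[A-Za-z][^<>]*>)   # followed by a full opening tag: <, a name, then >
    }{ }gx;
    return $s;
}

my $xml = <<'XML';
<xml>
<TI>Definitions, Exemptions and Rebates "where"
<VARPARA><VAR>E</VAR></VARPARA></TI>
</xml>
XML

print collapse_before_open_tag($xml);
```

Because the lookahead requires a letter right after <, closing tags such as </TI> are left alone, and because the pattern uses \s+ rather than \s*, no space is inserted where a tag already touches the preceding text. For anything beyond simple, well-behaved input, an XML parser is likely to be more robust than a regex.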