@files = glob "*.xml";

undef $/;
for $file (@files) {
    $indent = 0;
    open FILE, $file or die "Couldn't open $file for reading: $!";
    $_ = readline *FILE;
    close FILE or die "Couldn't close $file: $!";

    # Remove whitespace between > and < if that is the only thing separating them
    s/(?<=>)\s+(?=<)//g;

    # Indent
    s{  # Capture a tag <$1$2$3>,
        # a potential closing slash $1
        # the contents $2
        # a potential closing slash $3
        <(/?)([^/>]+)(/?)> 

        # Optional white space
        \s*

        # Optional tag.
        # $4 contains either undef, "<" or "</"
        (?=(</?))?
    }
    {
        # Adjust the indentation level.
        # $3: A <foo/> tag. No alteration to indentation.
        # $1: A closing </foo> tag. Drop one indentation level
        # else: An opening <foo> tag. Increase one indentation level
        $indent +=
            $3 ?  0 :
            $1 ? -1 :
                  1;

        # Put the captured tag back into place
        "<$1$2$3>" .
        # Two closing tags in a row. Add a newline and indent the next line
        ($1 and ($4 eq "</") ?
            "\n" . ("  " x $indent) : 
        # This isn't a closing tag but the next tag is. Add a newline and
        # indent the next line.
        $4 ?
            "\n" . ("  " x $indent) :
        # This isn't a closing tag - no special indentation. I forget why
        # this works.
            ""
        )
    # /g repeat as necessary
    # /e Execute the block of perl code to create replacement text
    # /x Allow whitespace and comments in the regex
    }gex;

    open FILE, ">", $file or die "Couldn't open $file for writing: $!";
    print FILE or die "Couldn't write to $file: $!";
    close FILE or die "Couldn't close $file: $!";
}

I'm using this code to re-indent a bunch of XML files. However, when I run it I get:

Use of uninitialized value $4 in string eq at C:/Users/souzamor/workspace/Parser/xmlreformat.pl line 25.

and line 25 is:

# $4 contains either undef, "<" or "</"

I don't know why this happens, and I'm new to Perl. Could someone please help me?

zostay
cybertextron

4 Answers

The $4 refers to the fourth capturing parenthesis in your regular expression, in this case: (?=(</?))?. As the comment states, this may be undefined because of the ? at the very end, which means "this thing may be there, but it also might not be".

If you use an undefined value (signalled via the special value undef in Perl) in certain ways, including in a string comparison with eq, you get a warning from Perl. You can easily check whether or not a variable is defined with defined($var).

In your particular case $4 is used in this phrase:

($1 and ($4 eq "</") ? "\n" . ("  " x $indent) : 
 $4                  ? "\n" . ("  " x $indent) :
                       ""

Fixing the warning is as easy as replacing those tests with this:

($1 and defined($4) and ($4 eq "</") ? "\n" . ("  " x $indent) : 
$4                                   ? "\n" . ("  " x $indent) :
                                       ""

Note that you don't have to check for defined($4) in the second line in this particular case, because testing an undefined value for truth doesn't trigger the warning, but it wouldn't hurt either.
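To see the guard in isolation, here is a minimal sketch. It assumes a simplified pattern with an ordinary optional group instead of the lookahead from the question, and next_is_closing_tag is a made-up helper name, not part of the original script. The point is only that an optional group that takes no part in the match leaves its capture undef, and that defined() keeps it away from eq.

```perl
use strict;
use warnings;

# Collect warnings so we can check that the guarded comparison stays silent.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Hypothetical helper (not from the question): does the tag at the start
# of $text sit directly in front of a closing tag?
sub next_is_closing_tag {
    my ($text) = @_;
    # Simplified pattern: $1 = optional slash, $2 = tag name,
    # $3 = optional start of the following tag ("<" or "</").
    return 0 unless $text =~ m{^<(/?)(\w+)>(</?)?};
    my $next = $3;    # undef when the optional group did not take part
    # defined() first, so eq never sees undef.
    return (defined($next) and $next eq '</') ? 1 : 0;
}

print next_is_closing_tag('</b></a>'), "\n";    # prints 1
print next_is_closing_tag('</a>'),     "\n";    # prints 0, $3 was undef
print next_is_closing_tag('<a><b>'),   "\n";    # prints 0, $3 was "<"
print scalar(@warnings), " warnings\n";         # prints "0 warnings"
```

Dropping the defined() test from the helper would give the same return values, but the second call would then emit the "Use of uninitialized value" warning.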

Moritz Bunkus

$4 is only set if this final group actually matches:

(?=(</?))?

Because of that final question mark, the overall match can succeed and proceed to the replacement even when the lookahead fails, and then $4 will be undef. You can supply a default instead (the // operator needs Perl 5.10 or better; on older versions it should be safe to use || here):

(($4 // '') eq "</")

You'll just have to guard against that or turn off the warnings. You can't move the capture outside the zero-width look ahead because that will always set $4 to an empty string.
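A small sketch of the defined-or approach, assuming Perl 5.10 or newer. The variable $next is just a stand-in for an unset $4 and does not appear in the original script. It also shows where // and || differ, which is why || is only "safe" when no false value such as 0 or "" is a meaningful result.

```perl
use strict;
use warnings;

# Collect warnings so we can confirm that none are raised.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $next;    # stand-in for $4 when the optional lookahead matched nothing

# undef is replaced by '' before the comparison, so eq never sees undef.
my $is_closing = (($next // '') eq '</') ? 1 : 0;

# // only replaces undef; || replaces every false value.
my $zero       = 0;
my $defined_or = $zero // 'default';    # stays 0
my $logical_or = $zero || 'default';    # becomes 'default'

print "is_closing: $is_closing\n";
print "defined-or: $defined_or\n";
print "logical-or: $logical_or\n";
print scalar(@warnings), " warnings\n";
```

For the comparison against "</" both operators give the same answer, because neither "" nor "0" equals "</".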

zostay

This run-time warning is telling you that, for your current input, $4 has no value, but you're using it anyway.

So the lines:

 # Optional tag.
 # $4 contains either undef, "<" or "</"

are accurate, though easy to misread. Perl reports any use of undef with the wording "uninitialized value", so the warning means $4 was undef when it was compared with eq.

That happens whenever the optional lookahead matches nothing at the time this s{}{} statement runs.

Unless you MUST write an XML pretty-printer, you should get one from the CPAN.

Len Jaffe

If it works correctly, you could simply ignore the warnings. Change this line

close FILE or die "Couldn't close $file: $!";

to

 close FILE or die "Couldn't close $file: $!";
 no warnings 'uninitialized';

But it would be better to use an XML parser library to parse XML.
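One thing to keep in mind is that no warnings is lexically scoped: it only affects the rest of the enclosing block. A rough sketch of that behaviour, with $maybe as a made-up variable standing in for an unset $4:

```perl
use strict;
use warnings;

# Collect warnings instead of printing them to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $maybe;    # deliberately left undef

{
    # Silences only the 'uninitialized' category, and only inside this block.
    no warnings 'uninitialized';
    my $quiet = ($maybe eq '</') ? 1 : 0;    # no warning here
}

# Outside the block the warning is active again.
my $loud = ($maybe eq '</') ? 1 : 0;

print scalar(@warnings), " warning(s) collected\n";
print $warnings[0];
```

In the script from the question, putting the pragma inside the for loop keeps the rest of the program under full warnings.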

Regards,

user1126070