I have been using syntax like the following for months, without triggering a warning:

die join('', sprintf(Dumper [@stack]), sprintf(Dumper {%oprAtnNOW}), 'opt tojudge not specified');

That is, I have used sprintf with Dumper, without specifying a format.

In the following code, we see that this works fine, but only up to a point. When %oprAtnNOW contains a long string, a warning is triggered. (In all cases the string compiles as a regex; but before compilation, it is nothing but a string.)

What causes the warning with the long string? Why is there a "missing argument"? Granted, sprintf is supposed to be given a format, as in https://perldoc.perl.org/functions/sprintf. But why is this only enforced when a smaller string is replaced by a long string?

#!/usr/bin/perl
use strict; use warnings;
use Data::Dumper qw(Dumper);
$Data::Dumper::Sortkeys = 1;
print "Perl version: $^V\n";

my %oprAtnNOW; 
my $string='~~~~~1983-10-21 Fri 13:01:13, today we went to the movie.';

%oprAtnNOW = (
    Vv => {
        v=>[ '(?<a>a)',],
    },
);

tryit();

%oprAtnNOW = (
    Vv => {
        v=>[ 
'(?m)^(?<boundjour2009>(?<tilde5>[~]{5})[\\x20\\t]*(?<dateISO1mbeWeekdaymbeTIME>(?<dateISO1mbeWeekday>(?<dateISO1>(?<YYYY>[1-9]\\d\\d\\d)[-](?<nMonth2>0[1-9]|1[0-2])[-](?<nMonthDay2>3[01]|[0-2][0-9]))([\\x20\\t]+(?<wWeekdayAllor3>Sunday|Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sun|Mon|Tue|Wed|Thu|Fri|Sat))?)([\\x20\\t]+(?<nTIMEdiverse>(at[\\x20\\t]+)?((?<HHcMMcSS>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9]):(?<SS>[0-5][0-9]))|(?<HHMMmbeSS>(?<HHMM>(?<HH>0[0-9]|1[0-9]|2[0-3])(?<MM>[0-5][0-9]))(?<SS>[0-5][0-9])?)|(?<HHcMM_pct_cSS>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9])%:(?<SS>[0-5][0-9]))|(?<HHcMM_stop>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9])(?![:][0-5][0-9])))))?))',
        ],
    },
);

tryit();

sub tryit
{
    my $rgx=qr/$oprAtnNOW{Vv}->{v}->[0]/;
    if($string=~$rgx)
    {
        print Dumper \%+;
    }
    print "with format:\n";
    print sprintf('%s', Dumper \%oprAtnNOW); 
    print "WITHOUT format:\n";
    print sprintf(Dumper \%oprAtnNOW); 
}

The output:

Perl version: v5.18.4
$VAR1 = {
          'a' => 'a'
        };
with format:
$VAR1 = {
          'Vv' => {
                    'v' => [
                             '(?<a>a)'
                           ]
                  }
        };
WITHOUT format:
$VAR1 = {
          'Vv' => {
                    'v' => [
                             '(?<a>a)'
                           ]
                  }
        };
$VAR1 = {
          'HH' => '13',
          'HHcMMcSS' => '13:01:13',
          'MM' => '01',
          'SS' => '13',
          'YYYY' => '1983',
          'boundjour2009' => '~~~~~1983-10-21 Fri 13:01:13',
          'dateISO1' => '1983-10-21',
          'dateISO1mbeWeekday' => '1983-10-21 Fri',
          'dateISO1mbeWeekdaymbeTIME' => '1983-10-21 Fri 13:01:13',
          'nMonth2' => '10',
          'nMonthDay2' => '21',
          'nTIMEdiverse' => '13:01:13',
          'tilde5' => '~~~~~',
          'wWeekdayAllor3' => 'Fri'
        };
with format:
$VAR1 = {
          'Vv' => {
                    'v' => [
                             '(?m)^(?<boundjour2009>(?<tilde5>[~]{5})[\\x20\\t]*(?<dateISO1mbeWeekdaymbeTIME>(?<dateISO1mbeWeekday>(?<dateISO1>(?<YYYY>[1-9]\\d\\d\\d)[-](?<nMonth2>0[1-9]|1[0-2])[-](?<nMonthDay2>3[01]|[0-2][0-9]))([\\x20\\t]+(?<wWeekdayAllor3>Sunday|Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sun|Mon|Tue|Wed|Thu|Fri|Sat))?)([\\x20\\t]+(?<nTIMEdiverse>(at[\\x20\\t]+)?((?<HHcMMcSS>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9]):(?<SS>[0-5][0-9]))|(?<HHMMmbeSS>(?<HHMM>(?<HH>0[0-9]|1[0-9]|2[0-3])(?<MM>[0-5][0-9]))(?<SS>[0-5][0-9])?)|(?<HHcMM_pct_cSS>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9])%:(?<SS>[0-5][0-9]))|(?<HHcMM_stop>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9])(?![:][0-5][0-9])))))?))'
                           ]
                  }
        };
WITHOUT format:
Missing argument in sprintf at /Users/kpr/u/kh/bin/z.pl line 38.
Invalid conversion in sprintf: "%:" at /Users/kpr/u/kh/bin/z.pl line 38.
$VAR1 = {
          'Vv' => {
                    'v' => [
                             '(?m)^(?<boundjour2009>(?<tilde5>[~]{5})[\\x20\\t]*(?<dateISO1mbeWeekdaymbeTIME>(?<dateISO1mbeWeekday>(?<dateISO1>(?<YYYY>[1-9]\\d\\d\\d)[-](?<nMonth2>0[1-9]|1[0-2])[-](?<nMonthDay2>3[01]|[0-2][0-9]))([\\x20\\t]+(?<wWeekdayAllor3>Sunday|Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sun|Mon|Tue|Wed|Thu|Fri|Sat))?)([\\x20\\t]+(?<nTIMEdiverse>(at[\\x20\\t]+)?((?<HHcMMcSS>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9]):(?<SS>[0-5][0-9]))|(?<HHMMmbeSS>(?<HHMM>(?<HH>0[0-9]|1[0-9]|2[0-3])(?<MM>[0-5][0-9]))(?<SS>[0-5][0-9])?)|(?<HHcMM_pct_cSS>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9])%:(?<SS>[0-5][0-9]))|(?<HHcMM_stop>(?<HH>0[0-9]|1[0-9]|2[0-3]):(?<MM>[0-5][0-9])(?![:][0-5][0-9])))))?))'
                           ]
                  }
        };

TLP
Jacob Wegelin
  • Why would you use `sprintf` on something that is already a string, i.e. output from `Dumper`? That seems very unnecessary. – TLP Jul 23 '21 at 14:13

1 Answer

It's not because of the length, but because the long string contains a percent sign.

...(?<MM>[0-5][0-9])%:(?<SS>[0-5][0-9]))...
                    ~

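Because the Dumper output is the only argument passed to sprintf, it is interpreted as the format string, so the %: inside it is parsed as a conversion specification.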

You can demonstrate the same behaviour with much shorter strings, e.g.

sprintf '%';
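
To see the difference side by side, here is a minimal sketch. The variable $short is a hypothetical, shortened stand-in for the long pattern in the question, and the $SIG{__WARN__} handler is only there to collect warnings so they can be inspected. The exact warning text may differ between Perl versions.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Collect warnings in an array instead of sending them to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Hypothetical short stand-in for the long pattern; note the '%:' in the middle.
my $short = '(?<MM>[0-5][0-9])%:(?<SS>[0-5][0-9])';

# Here $short is the first (and only) argument, so it is used as the format.
my $as_format = sprintf($short);
my $warnings_from_format = scalar @warnings;

# Here '%s' is the format and $short is plain data, so its '%' is left alone.
my $as_arg = sprintf('%s', $short);
my $warnings_from_arg = scalar(@warnings) - $warnings_from_format;

print "warnings when used as format:   $warnings_from_format\n";
print "warnings when used as argument: $warnings_from_arg\n";
print "warning text: $_" for @warnings;
```

The length of the string plays no part here: a string of any length triggers the warning as soon as it contains a % that sprintf cannot pair with an argument or a valid conversion.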

If you don't need to format, just use print:

print Dumper \%oprAtnNOW; 
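
If the string has to be produced rather than printed, the following sketch compares three ways of getting the Dumper output into a variable. The %data hash and its contents are made up for illustration. The third option, doubling every % before handing the string to sprintf, is shown only to illustrate how sprintf treats %%; it is not something the code in the question needs.

```perl
#!/usr/bin/perl
use strict; use warnings;
use Data::Dumper qw(Dumper);
$Data::Dumper::Sortkeys = 1;

# Hypothetical data containing a literal percent sign.
my %data = (pattern => '(?<MM>[0-5][0-9])%:(?<SS>[0-5][0-9])');

# Dumper already returns a string.
my $dump = Dumper \%data;

# Option 1: use the string as it is, no sprintf involved.
my $plain = $dump;

# Option 2: give sprintf an explicit format, so $dump is only data.
my $formatted = sprintf('%s', $dump);

# Option 3: escape each '%' as '%%' before using the string as a format.
(my $escaped = $dump) =~ s/%/%%/g;
my $roundtrip = sprintf($escaped);

print "all three agree\n"
    if $plain eq $dump && $formatted eq $dump && $roundtrip eq $dump;
```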
choroba
  • You should mention that using `sprintf` this way is not very useful, and prone to bugs. – TLP Jul 23 '21 at 14:16
  • @choroba, how did you get the tilde under the percent sign in your code snippet? – Jacob Wegelin Jul 23 '21 at 14:17
  • @JacobWegelin It is on the line below. – TLP Jul 23 '21 at 14:18
  • @TLP what way, in particular? I need to generate error messages that tell me exactly where I was in my code and what the structures (arrays, hashes) are when something fails. So I use join and sprintf to create my error messages. Is there some other way I should be doing this? This way, I do not have to issue multiple print STDERR commands for a single error. See the sample error message at the start of my original post. – Jacob Wegelin Jul 23 '21 at 14:20
  • @JacobWegelin `Dumper` is a function that returns a string. The same way `sprintf` returns a string. You can literally just not use `sprintf` and your code will work. – TLP Jul 23 '21 at 14:21
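
Applied to the error message from the question, that suggestion could look like the sketch below. The contents of @stack and %oprAtnNOW and the helper name fail_with_context are invented for illustration, and the eval is only there so the message can be captured and shown.

```perl
#!/usr/bin/perl
use strict; use warnings;
use Data::Dumper qw(Dumper);
$Data::Dumper::Sortkeys = 1;

# Hypothetical state at the point of failure; one value contains a '%'.
my @stack     = ('main', 'tryit');
my %oprAtnNOW = (Vv => { v => ['50%: done'] });

# Hypothetical helper: builds a single error message with join and Dumper only.
sub fail_with_context {
    die join('', Dumper(\@stack), Dumper(\%oprAtnNOW), 'opt tojudge not specified');
}

# Capture the message instead of letting the script terminate.
my $ok = eval { fail_with_context(); 1 };
my $message = $@;
print $message unless $ok;
```

Because the final string does not end in a newline, die still appends the file name and line number, so the message keeps reporting where the failure happened.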