
I'm trying to validate that an input to my script (a CSV file) has the column names I'm expecting. The catch is that some of the column names contain special characters (opening and closing parentheses in this case).

I can get the code to work if I use the experimental "smartmatch" feature, but I have been warned not to use it (see "How can I check if a Perl array contains a particular value?").

Why isn't my grep working?

my @valid_column_names = (
    "some (value 1)",
    "some (value 2)",
    "something else",
  ); 

my $key = "some (value 1)";   # Test Case 1
# my $key = "something else"; # Test Case 2 (swap with the line above to run it)

foreach my $val (@valid_column_names){
  if ( grep(/^$key$/, @valid_column_names) ){ # <--- Why won't you match!?
    print "IF - key: \"$key\" val: \"$val\"\n";
  } elsif ( $key ~~ @valid_column_names ){
    print "ELSIF - key: \"$key\" val: \"$val\"\n";
  } else {
    print "ELSE - key: \"$key\" val: \"$val\"\n";
  }   
}

Test Case 1 output

ELSE - key: "some (value 1)" val: "some (value 1)"
ELSE - key: "some (value 1)" val: "some (value 2)"
ELSE - key: "some (value 1)" val: "something else"

Test Case 2 output

IF - key: "something else" val: "some (value 1)"
IF - key: "something else" val: "some (value 2)"
IF - key: "something else" val: "something else"
Casa
  • `$` is not end of string, `\z` is. `$` allows a trailing line feed, so your code would consider `"something else\n"` to be a valid column name. – ikegami Mar 18 '22 at 18:27

2 Answers


The regex match operator expects a regex pattern.

To convert text into a pattern that matches the text, you can use quotemeta.

my $pat = "^" .quotemeta( $text ) . "\\z";

/$pat/

It might be advantageous to compile the pattern up front (e.g. to avoid compiling it repeatedly later).

my $pat = "^" .quotemeta( $text ) . "\\z";
my $re = qr/$pat/;

/$re/

\Q..\E can be used inside double-quoted string literals and regex literals to invoke quotemeta, so the above simplifies to the following:

my $re = qr/^\Q$text\E\z/;

/$re/
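
As a minimal sketch of why `\z` is used here rather than `$` (the names `$lenient` and `$strict` are only illustrative), the following compares the two anchors against a string with and without a trailing line feed:

```perl
use strict;
use warnings;

my $text = "some (value 1)";

my $lenient = qr/^\Q$text\E$/;    # `$` also matches before a trailing line feed
my $strict  = qr/^\Q$text\E\z/;   # `\z` matches only at the very end of the string

for my $candidate ( "some (value 1)", "some (value 1)\n" ) {
    # Make the line feed visible in the output.
    ( my $shown = $candidate ) =~ s/\n/\\n/;
    print "\"$shown\": ",
        "with \$ ", ( $candidate =~ $lenient ? "matches" : "fails" ), ", ",
        "with \\z ", ( $candidate =~ $strict ? "matches" : "fails" ), "\n";
}
```

Only the `\z` version should reject the string that ends in a line feed.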

In your case, you want to check if a string (the key) is one of many (a valid column name). To achieve this, we can use the following:

my $pat = join "|", map quotemeta, @valid_column_names;
my $re = qr/^(?:$pat)\z/;

$key =~ $re
   or die( "Invalid key \"$key\"\n" );

But I would use a hash.

my %valid_column_names = map { $_ => 1 } @valid_column_names;

$valid_column_names{ $key }
   or die( "Invalid key \"$key\"\n" );
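
As a rough sketch of how the hash lookup might be applied to a whole header row (the `@header` list below is made up for illustration; a real script would read it from the CSV file, ideally with a CSV parser such as Text::CSV rather than a plain split), something like this collects every unexpected column name:

```perl
use strict;
use warnings;

my @valid_column_names = ( "some (value 1)", "some (value 2)", "something else" );
my %valid_column_names = map { $_ => 1 } @valid_column_names;

# Hypothetical header row, hard-coded here to keep the example self-contained.
my @header = ( "some (value 1)", "something else", "some value 1" );

# Keep only the names that are not keys of the lookup hash.
my @unknown = grep { !$valid_column_names{$_} } @header;

print "Unknown column: \"$_\"\n" for @unknown;
```

Because hash keys are compared as plain strings, no escaping of the parentheses is needed.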
ikegami

As I was writing this question, I noticed that the linked question ("How can I check if a Perl array contains a particular value?") says "#value can be any regex. be safe".

I'm guessing that it's interpreting the open and close parenthesis as a capture group.
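
A small sketch that seems to confirm this guess (the candidate strings are chosen only for illustration): with the key interpolated as-is, the pattern matches the text without parentheses and captures "value 1", while the literal text does not match.

```perl
use strict;
use warnings;

my $key = "some (value 1)";

# Interpolated as-is, the parentheses group "value 1" instead of matching "(" and ")".
for my $candidate ( "some (value 1)", "some value 1" ) {
    if ( $candidate =~ /^$key$/ ) {
        print "\"$candidate\" matches; captured \"$1\"\n";
    }
    else {
        print "\"$candidate\" does not match\n";
    }
}

# quotemeta backslash-escapes every non-word character, including the spaces.
print quotemeta($key), "\n";
```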

Playing around with https://regex101.com/, I noticed that if I escape the parentheses, the pattern matches there, but it still doesn't match in my Perl code unless I change $key to...

my $key = "some \\\(value 1\\\)"; # Test Case 3

Test Case 3 output

IF - key: "some \(value 1\)" val: "some (value 1)"
IF - key: "some \(value 1\)" val: "some (value 2)"
IF - key: "some \(value 1\)" val: "something else"
Casa
  • Use `\Q` in the regex to escape metacharacters and make the match literal. Also, when you use a double-quoted string, you need to double the backslashes as well. – TLP Mar 18 '22 at 18:10