
I am attempting to use Perl's Regular Expression Substitution with evaluation to help make some config files more dynamic during a Clearcase -> Git migration. The Clearcase system is highly dependent on the /vob/ directory, but we need to make this more dynamic for our Jenkins builds to be happy. I'm trying to reduce the likelihood that I break the Clearcase builds while migrating over.

I have a configuration file that is a text file with a path per line:

/vob/config/file1
/vob/config/file2
/vob/config/file3

This configuration does some additional stuff with those configuration files. The orchestration of that "stuff" is managed by a Perl script. I would like to have a few environment variables ("VOB_FOO") that I can override when I run the script.

I'm a complete novice with Perl, so what I thought was to use the Perl environment variable syntax, do a regex on it and evaluate the substitution results in-line as I'm processing the file.

I want my new config file to have explicit $ENV{'VOB_FOO'} entries in the file, so the file would become:

$ENV{'VOB_FOO'}/config/file1
$ENV{'VOB_FOO'}/config/file2
$ENV{'VOB_FOO'}/config/file3

And the resulting regular expression substitution+evaluation would turn into (if VOB_FOO=/home/me/foo) :

$ENV{'VOB_FOO'}/config/file1   ->    /home/me/foo/config/file1
$ENV{'VOB_FOO'}/config/file2   ->    /home/me/foo/config/file2
$ENV{'VOB_FOO'}/config/file3   ->    /home/me/foo/config/file3

My regular expression matches fine and it appears the substitution is working, but the evaluation part of the substitution is not, and I could use some assistance here. I get a successful match, but the substitution comes off as:

$ENV{'VOB_FOO'}/config/file1   ->    $ENV{'VOB_FOO'}/config/file1
$ENV{'VOB_FOO'}/config/file2   ->    $ENV{'VOB_FOO'}/config/file2
$ENV{'VOB_FOO'}/config/file3   ->    $ENV{'VOB_FOO'}/config/file3

Is there any caveat to this evaluation or some way I can make this work correctly? Here is my code:

## See if we need to substitute an environment variable (e.g., is there a $ENV{} anywhere?)
## s - substitute through regular expressions (s/foo/bar/e)
## e modifier evaluates replacement as perl statement

{
    use re 'debugcolor';

    # this is for debugging only - I want to substitute 
    # grab the $ENV{'VOB'} string from the file and substitute
    # I may have multiple environment variables that I have to 
    # contend with. 
    my $vob = $ENV{'VOB'};  
    print $vob; 
    print "\n";

    my $regexp = qr/(\$ENV\{[\'][\w]*[\']\})/;

    if( $second =~ m/$regexp/ )
    {
        print "Found the regexp; attempting substitution.\n";
        $second =~ s/$regexp/$1/e;  
    }
    else
    {
        print $regexp + "\n";
        print $second + "\n";
        print "Did not find the regexp\n";
    }
}

I am also open for critique or suggestions on a better way to do this - I'm not tied to this approach or code while I'm working through making this happen.

Kevin K
  • Change $second =~ s/$regexp/$1/e; to $second =~ s/$regexp/$vob/e; – Andrey Jun 04 '18 at 15:51
  • @Andrey - I wanted my regexp to find any environment variable string and substitute it (I did not make that clear initially). I'm going to have multiple variables to handle and I don't want to build the logic for each of them into code. You have given me the idea to handle this in a somewhat different way - look for `/vob/`, see if the VOB_FOO environment variable is set, and make the substitution if both are true. – Kevin K Jun 04 '18 at 15:59
  • Can you explain in more detail what your input looks like? I don't quite get where those environment variables are. Do you have literal `$ENV{...}` strings in your input files? Or are those part of the environment that the script runs in? Or both? Please [edit] and add more details. – simbabque Jun 04 '18 at 16:01
  • Note that `$regexp + "\n"` etc. should be `$regexp . "\n"` – Borodin Jun 04 '18 at 16:05
  • Thanks @Borodin! Jumping around between Python, Groovy, and Perl yesterday and got those messed up. – Kevin K Jun 05 '18 at 11:35

2 Answers


I think all you need is this. Instead of extracting the whole expression, it takes the hash key and uses it on the real %ENV

I've added an alternation so that the hash key may be written with or without quotes, and may have leading or trailing spaces

$second =~ s/\$ENV\{\s*(?|(\w+)|'(\w+)')\s*\}/$ENV{$1}/g
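
As a minimal sketch of how this substitution might be applied to every line of the config file: the helper name `expand_env`, the sample lines, and the variable `VOB_UNSET_EXAMPLE` are hypothetical, and `VOB_FOO` is set to the value used in the question. Leaving placeholders for unset variables untouched is an assumption of this sketch, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: expand $ENV{NAME} or $ENV{'NAME'} placeholders in a line.
# Group 1 is the whole placeholder, group 2 is the key (branch reset keeps the
# key in the same group for both the quoted and unquoted forms).
# Assumption: a placeholder naming an unset variable is left as-is.
sub expand_env {
    my ($line) = @_;
    $line =~ s/(\$ENV\{\s*(?|(\w+)|'(\w+)')\s*\})/defined($ENV{$2}) ? $ENV{$2} : $1/ge;
    return $line;
}

# Sample environment and config lines mirroring the question.
$ENV{VOB_FOO} = '/home/me/foo';
delete $ENV{VOB_UNSET_EXAMPLE};

my @config = (
    q($ENV{'VOB_FOO'}/config/file1),
    q($ENV{VOB_FOO}/config/file2),
    q($ENV{'VOB_UNSET_EXAMPLE'}/config/file3),
);

print expand_env($_), "\n" for @config;
```

The `/e` modifier is used here only so the replacement can test `defined`; the lookup still goes through the real `%ENV` and never evaluates text taken from the file.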
Borodin
  • I like this solution - it worked for me and was quite elegant. I did modify my approach so that I would pull out the hash key, as the basis of my end solution. I will accept this as the answer. – Kevin K Jun 05 '18 at 11:36

With a literal string captured, $1 holds mere characters ('$'.'E'.'N'...), which first need to be made into a variable name and then evaluated. So you need two evals:

use warnings;
use strict;
use feature 'say';

my $var = q(a_$ENV{SHELL}_b);   # like $ENV{'VOB'} read from a file

if ( $var =~ s/(\$ENV\{.*?\})/$1/ee ) {  # WARNING: security?
    say $var
}

Since } isn't ever a part of the environment variable name I simply match everything up to } using the non-greedy .*?. See this post for a detailed explanation of ee.
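
To see why a single /e appears to do nothing, here is a small comparison sketch. The value of `VOB_FOO` and the variable names `$template`, `$once`, and `$twice` are assumptions made for illustration.

```perl
use strict;
use warnings;

$ENV{VOB_FOO} = '/home/me/foo';   # assumed value, as in the question

my $template = q($ENV{'VOB_FOO'}/config/file1);

# One /e: the replacement code is `$1`, and its value is the matched text,
# so the placeholder is replaced with itself.
(my $once = $template) =~ s/(\$ENV\{.*?\})/$1/e;

# Two /e: the value of `$1` is then evaluated again as Perl code,
# which performs the actual %ENV lookup.
(my $twice = $template) =~ s/(\$ENV\{.*?\})/$1/ee;

print "$once\n";    # unchanged placeholder
print "$twice\n";   # expanded path
```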

However, note that ee comes with serious security considerations, as it will turn the given string into a variable and eval it, no questions asked. It also doesn't work in taint mode. So use carefully and only in tightly controlled circumstances.

A safer way is to capture the environment variable name itself and look it up in %ENV in the replacement, as Borodin's answer suggests:

$second =~ s/\$ENV\{(.*?)\}/$ENV{$1}/g;

Either way, also note that you don't need to first match then substitute.
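
A minimal sketch of doing it in one step: s/// returns the number of substitutions made (a false value if there were none), so its result can drive the branch directly. The sample line, the value of `VOB_FOO`, and the `$count` variable are assumptions for illustration, and the pattern is narrowed to unquoted `\w+` keys.

```perl
use strict;
use warnings;

$ENV{VOB_FOO} = '/home/me/foo';   # assumed value, as in the question

my $second = q($ENV{VOB_FOO}/config/file1);

# No separate m// test: the return value of s/// says whether anything changed.
my $count = ( $second =~ s/\$ENV\{(\w+)\}/$ENV{$1}/g );

if ($count) {
    print "Substituted $count placeholder(s): $second\n";
}
else {
    print "No placeholder found in: $second\n";
}
```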


The danger is that if the string happens to contain any code it is blindly eval-ed

zdim
  • Thanks for both the explanation and an alternative. I saw multiple posts about security concerns with eval (and double-eval) on user-supplied code. Since the input file this runs on is a controlled resource, I was less worried about that, but being security conscious is always a good thing! – Kevin K Jun 05 '18 at 11:33
  • Right, your question makes it clear that the resource is safe, or I wouldn't have mentioned `/ee`. However, code tends to evolve and just in general a good warning is due on this feature. Note that `/e` doesn't have any comparable security implications, as the parser never gets involved. It just evaluates code from the source file. – zdim Jun 05 '18 at 17:01