I have a Perl script that loads each line of a text file into a hash.

&load_map($word_file,"words");

sub load_map {
    my ($file_path,$assoc) = @_;
    open FILE, $file_path or die "Cannot open file $file_path";

    while ($line = <FILE>) {
        chomp $line;
        $$assoc{$line}=1;
    }
    close FILE;
}

Later on in the script, I look in the hash to see if a word is there before I do something:

    if($words{$wordKey}) {
        do something...
    }

This works fine when I run the script in my local environment. When I run it in a Docker container, $words{$wordKey} has no value or isn't found, so the code I need to run never executes.

Printing $words{$wordKey} gives an empty value in the Docker container. It prints 1 in my local environment.

The strange thing is, I know the hash data does exist in the docker container because if I loop through the entire hash and look at the keys 1 by 1, I can see each key and eventually get to the key I'm looking for, but this defeats the entire purpose of a hash. I shouldn't have to loop through the entire hash from beginning until I find my key.

    while( my( $key, $value ) = each %words ){
        if($key == $wordKey) {
            Do something
        }
    }

Has anyone come across this problem and understood what I may or may not be doing to cause it? It's driving me crazy.

===============================================================

New code and output

CASE 1:

&preload_words($word_file);

sub preload_words {
    my %localhash;
    my ($file) = @_;
    open my $fh, '<', $file or die "Can't open $file: $!";

    my $avalue;
    while (my $line = <$fh>) {
        chomp $line;
        $localhash{$line}=1;

        print "word: $line, local value: $localhash{$line} \n";
        $avalue = $line;
    }
    close $fh;

    print "a word: $avalue, a value: $localhash{$avalue} \n";

    my $bvalue = $localhash{'written'};
    print "b word: written, b value: $bvalue \n";
}


outputs:

...
word: worn, local value: 1
word: win, local value: 1
word: won, local value: 1
word: write, local value: 1
word: wrote, local value: 1
word: written, local value: 1 <-- last print from loop
a word: written, a value: 1 <-- avalue last dynamically set from loop
b word: written, b value: 1 <-- hardcoded last word of loop print

CASE 2:

&preload_words($word_file);

sub preload_words {
    my %localhash;
    my ($file) = @_;
    open my $fh, '<', $file or die "Can't open $file: $!";

    my $avalue;
    while (my $line = <$fh>) {
        chomp $line;
        $localhash{$line}=1;

        print "word: $line, local value: $localhash{$line} \n";
        $avalue = $line;
    }
    close $fh;

    print "a word: $avalue, a value: $localhash{$avalue} \n";

    my $bvalue = $localhash{'wrote'};
    print "b word: wrote, b value: $bvalue \n";
}

...
word: worn, local value: 1
word: win, local value: 1
word: won, local value: 1
word: write, local value: 1
word: wrote, local value: 1 <-- prints value in loop but nothing outside of loop scope.
word: written, local value: 1 <-- last print from loop
a word: written, a value: 1 <-- avalue last dynamically set from loop
b word: wrote, b value:  <-- hardcoded 2nd to last word of loop print returns no value

It appears to be some sort of scoping issue in the container even within the same function. Once I'm out of the while loop, I can only dereference the very last index directly. Thoughts?

adhoc
  • I don't know what a "_docker container_" does but you are using symbolic references, what you shouldn't ever do (and have no reason for!). The problem is with passing a string `"words"` and making a variable name out of it -- with `use strict;` that won't even compile. Instead, pass a hash reference, like you would in any other language, and work with that (or build the hash in the sub and return it, or its reference). There's a good chance that the error will go away. (I don't understand the rest of the question...) – zdim Sep 15 '19 at 22:39
  • One difference: `$words{$wordKey}` performs a string comparison, but `$key == $wordKey` is a numerical comparison. You may need to normalize the values e.g. using `$words{0+$wordKey}`. – ikegami Sep 15 '19 at 23:58
  • "_I know the hash data does exist in the docker container_" --- How do you know? Recall that once you ask for a key it will be created, _autovivified_; so after `if ($h{k})` that key `k` _does exist_, even if it didn't exist before. If the code you show at the end has a `$value` for `$wordKey` then I don't see how that first `if` could fail. But the key itself could've been autovivified when you checked even if not created in the sub. As for why-not-created ... I don't see any clue here, except for symbolic references (and apparent use of globals and no `use strict;` but those aren't errors). – zdim Sep 16 '19 at 08:33
  • I'm pretty sure that the Docker container has nothing to do with this problem. I suspect that the problem is down to your strange use of symbolic references. – Dave Cross Sep 16 '19 at 12:13
  • I replaced the code as suggested by zdim passing in a reference to a global hash I created. It still only works on my local machine and not in the container. I know the hash does exist because I can loop through the hash and print up every value of each key. I just can't get the value of the key directly no matter how I dereference it. – adhoc Sep 16 '19 at 15:49
  • "_reference to a global hash_" -- Let's then try one more thing: make that hash lexical (`my %hash` at the place where the hash need be declared), instead of global. These may well be scoping problems? (If so, having no globals will force to clean that up.) Passing around a reference to a global may have strange effects, depending on what's done with it (as it is also seen everywhere without being passed!). It's difficult to say without seeing the code. – zdim Sep 16 '19 at 16:59
  • "_print up every value of each key. I just can't get the value of the key directly_" -- let's try another thing: wherever you "_print up every value of each key_" make sure to also "_print directly_" (so right next to each other), and vice versa -- where you try and can't "_print directly_" add the print of every key+value. (Because I don't know how it can be that you can print key+value but can't dereference a key, other than by having in fact something else going on.) Another note: print your hash using `Dumper` module or such (I like `Data::Dump` with its `dd` and `pp` which isn't core) – zdim Sep 16 '19 at 17:03
  • @zdim I refactored my code and edited my original post with 2 cases and outputs for them both. There seems like a scoping issue as shown in case 2. Thanks for all the help so far. – adhoc Sep 16 '19 at 22:39
  • Ah, great! Will look at it a little later, need to run now ... one thing: you have `%localhash` but write with `$localhash->{...}` what makes `$localhash` a hash-reference, a scalar! (With `use strict;` it won't allow undeclared variables and will complain about `$localhash`. Without it, it just makes it right there, and as a global.) You want `$localhash{$line}=1;` instead. (This is on a quick glance...) – zdim Sep 16 '19 at 23:04
  • Hey thanks for looking when you have a moment. I updated the code as you said, but the same output still occurs. There is definitely a scoping issue but I have no idea how to address it due to my limited knowledge of Perl. – adhoc Sep 16 '19 at 23:14
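
The comparison issue raised in the comments above can be sketched as follows. The key and lookup values are hypothetical, chosen to resemble a word read from a file with a stray trailing \r. With non-numeric strings, == converts both sides to 0, so a loop that tests `$key == $wordKey` can appear to "find" a key that a direct hash lookup (which compares strings) cannot.

```perl
use strict;
use warnings;

# Hypothetical values: a key read from a CRLF file and the word being looked up.
my $key     = "wrote\r";
my $wordKey = "wrote";

{
    # Silence the "isn't numeric" warnings that == triggers on these strings.
    no warnings 'numeric';
    # Both strings numify to 0, so == reports a match.
    print "== matches\n" if $key == $wordKey;
}

# eq compares the actual characters, so the stray \r makes the strings differ.
print "eq does not match\n" unless $key eq $wordKey;
```

Using eq in the loop would have made the loop fail in the same way as the direct lookup, which points at the keys themselves rather than at scoping.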

2 Answers

I don't know what a "docker container" does but you are using symbolic references (to make a variable name out of a string) and you should never do that so lightly ... if ever; see about it in perlfaq7, and in this article series and this post (for starters).

Besides, there isn't even a hint of a reason for it here -- instead of using a string literal ("words") to make a hash name, just pass a hash reference. Or build a hash in the sub and return it or its reference:

sub load_map {
    my ($file) = @_;
    open my $fh, '<', $file or die "Can't open $file: $!";

    my %assoc;    
    while (my $line = <$fh>) {
        chomp $line;
        $assoc{$line} = 1;
    }
    return \%assoc;
}

and get it in the caller for example as

my %words = %{ load_map('filename') };

If there is a reason for the hash to exist in the caller before this then pass its reference to the sub

my %words;
...
load_map('filename', \%words);

and in the sub work with the hash using that

sub load_map { 
    my ($file, $assoc) = @_;
    ...
    while (my $line = <$fh>) { 
        chomp $line;
        $assoc->{$line} = 1;
    }
    return 1;
}
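
For reference, here is a self-contained sketch of the pass-a-reference variant that can be run as is. The temporary file and its three words are assumptions made for the demonstration; File::Temp is a core module.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Fill the caller's hash (passed by reference) with one key per line of $file.
sub load_map {
    my ($file, $assoc) = @_;
    open my $fh, '<', $file or die "Can't open $file: $!";
    while (my $line = <$fh>) {
        chomp $line;
        $assoc->{$line} = 1;
    }
    close $fh;
    return 1;
}

# Hypothetical word list written to a temporary file for the demonstration.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "write\nwrote\nwritten\n";
close $tmp_fh;

my %words;
load_map($tmp_name, \%words);

# exists tests for the key itself, independent of the stored value.
print "found 'wrote'\n" if exists $words{wrote};
print scalar(keys %words), " words loaded\n";
```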

Note that, once you've opened the file to $fh, you can populate your hash simply by

my %assoc = map { chomp; $_ => 1 } <$fh>;

since the readline (aka <>) in list context, imposed here by map, returns all lines from the file.

However, the explicit line-by-line code above allows you to check as you go, etc.
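
As an illustration of "checking as you go", the sketch below skips blank lines and records duplicates while loading. The input string and the @skipped list are hypothetical; an in-memory filehandle stands in for a real file so the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical input, read through an in-memory filehandle instead of a file.
my $data = "write\n\nwrote\nwrite\nwritten\n";
open my $fh, '<', \$data or die "Can't open in-memory file: $!";

my (%assoc, @skipped);
while (my $line = <$fh>) {
    chomp $line;
    if ($line eq '') {
        # Blank line: nothing to store. $. holds the current line number.
        push @skipped, sprintf("line %d: empty", $.);
        next;
    }
    if (exists $assoc{$line}) {
        # Word already seen earlier in the input.
        push @skipped, sprintf("line %d: duplicate '%s'", $., $line);
        next;
    }
    $assoc{$line} = 1;
}
close $fh;

print "$_\n" for @skipped;
print join(', ', sort keys %assoc), "\n";
```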

I don't understand the rest of the question (appears as an explanation?), but what I addressed here is clearly off and looking for trouble (the "docker" problem may be due to that?); it needs to be corrected.

zdim

I ended up printing the hash keys using Data::Dump as zdim suggested, and it showed that each key built from the file was being stored as someword\r.
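
A similar check is possible with the core Data::Dumper module, sketched here as an alternative to Data::Dump. The hash contents are hypothetical stand-ins for keys read from a file with \r\n line endings. Setting $Data::Dumper::Useqq makes control characters show up as escapes rather than being printed raw, where a carriage return is invisible in most terminals.

```perl
use strict;
use warnings;
use Data::Dumper;

# Show control characters as escapes (\r, \n, ...) instead of printing them raw.
$Data::Dumper::Useqq    = 1;
$Data::Dumper::Sortkeys = 1;   # stable key order for readable output

# Hypothetical hash built from lines that ended in \r\n; chomp removed only \n.
my %words = map { $_ => 1 } ("wrote\r", "written\r");

my $dump = Dumper(\%words);
print $dump;
```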

The file evidently had \r\n line endings: chomp removed only the trailing \n, leaving the \r as part of each key, so every direct lookup or string comparison against the plain word failed.
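
One way to guard against this, as a minimal sketch: strip either line ending explicitly instead of relying on chomp. The in-memory input below is a hypothetical stand-in for the word file.

```perl
use strict;
use warnings;

# Hypothetical file contents with Windows (CRLF) line endings.
my $data = "write\r\nwrote\r\nwritten\r\n";
open my $fh, '<', \$data or die "Can't open in-memory file: $!";

my %words;
while (my $line = <$fh>) {
    # Drop a trailing LF or CRLF; chomp alone would leave the \r behind.
    $line =~ s/\r?\n\z//;
    $words{$line} = 1;
}
close $fh;

print "found 'wrote'\n" if exists $words{wrote};
```

Another option is to open the file with the :crlf PerlIO layer (for example '<:crlf' as the open mode) so the translation happens while reading; converting the file itself to Unix line endings also works.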

I will keep the code changes you all suggested, but wanted to let anyone interested know the underlying reason for the bug.

Thanks again for all of your help and suggestions!

adhoc