I am stuck on a Perl assignment from my Natural Language Processing course.
The task we have to solve in Perl is the following:
Input: the program takes two inputs from stdin, in the form: perl program.pl
Processing and Output:
Part 1: the program tokenizes the words in filename.txt and stores them in a hash together with their frequency of occurrence.
Part 2: the program looks up the input word in the hash. If the word cannot be found in the hash (and thus not in the text), it prints zero as the frequency of the word. If the word can be found in the hash, it prints the corresponding frequency value stored in the hash.
I am confident from experience that my script already handles Part 1 as stated above.
Part 2 has to be done with a Perl sub (subroutine) that takes the hash by reference, along with the word to look up. This is the part I had serious trouble with.
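To make the requirement concrete, here is a minimal sketch of what Parts 1 and 2 describe. It is an illustration under assumptions, not the assigned solution: the sub names count_words and word_frequency are made up, the text is passed in as a list of lines instead of being read from a file, and the `//` operator assumes Perl 5.10 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Part 1 (sketch): count lowercased word tokens in a list of lines.
# Returns a reference to a hash mapping word => frequency.
sub count_words {
    my @lines = @_;
    my %freq;
    for my $line (@lines) {
        # In list context, /\w+/g returns every match on the line.
        $freq{lc $_}++ for $line =~ /\w+/g;
    }
    return \%freq;
}

# Part 2 (sketch): look a word up in a hash passed by reference.
# Returns 0 when the word is not a key of the hash.
sub word_frequency {
    my ($freq_ref, $word) = @_;
    return $freq_ref->{lc $word} // 0;
}

my $freq = count_words("The cat sat.", "the dog sat too");
print word_frequency($freq, 'the'),  "\n";   # prints 2
print word_frequency($freq, 'bird'), "\n";   # prints 0
```

In a real run, the lines would presumably come from `<STDIN>` and the word from `@ARGV`, matching an invocation where the text is redirected into the script and the word is given as an argument.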
First version, before the major changes Stefan Becker suggested:
    #!/usr/bin/perl
    use warnings;
    use strict;
    sub hash_4Frequency
    {
        my ($hashWord, $ref2_Hash) = @_;
        print $ref2_Hash -> {$hashWord}, "\n"; # thank you Stefan Becker, for sobriety
    }
    my %f = (); # hash that will contain words and their frequencies
    my $wc = 0; # word-count
    my ($stdin, $word_2Hash) = @ARGV; # corrected, thanks to Silvar
    while ($stdin)
    {
        while ("/\w+/")
        {
            my $w = $&;
            $_ = $";
            $f{lc $w} += 1;
            $wc++;
        }
    }
    my @args = ($word_2Hash, %f);
    hash_4Frequency(@args);
The second version, after some changes:
    #!/usr/bin/perl
    use warnings;
    use strict;
    sub hash_4Frequency
    {
        my $ref2_Hash = %_;
        my $hashWord = $_;
        print $ref2_Hash -> {$hashWord}, "\n";
    }
    my %f = (); # hash that will contain words and their frequencies
    my $wc = 0; # word-count
    while (<STDIN>)
    {
        while (/\w+/)
        {
            chomp;
            my $w = $&;
            $_ = $";
            $f{$_}++ foreach keys %f;
            $wc++;
        }
    }
    hash_4Frequency($_, \%f);
When I execute './script.pl < somefile.txt someWord' in Terminal, Perl complains as follows (output for the first version):
Use of uninitialized value $hashWord in hash element at ./word_counter2.pl line 35.
Use of uninitialized value in print at ./word_counter2.pl line 35.
What Perl reports for the second version:
Can't use string ("0") as a HASH ref while "strict refs" in use at ./word_counter2.pl line 13, <STDIN> line 8390.
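One possible reading of this message, offered as a sketch and not as a diagnosis of the whole script: arguments reach a sub through the array @_, whereas %_ is an unrelated global hash that is empty here. Assigning an empty hash to a scalar yields 0, which would explain the string "0" being used as a hash reference. The sub names show_args and scalar_of_empty_hash below are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Arguments are unpacked from @_, in the order they were passed.
sub show_args {
    my ($word, $hash_ref) = @_;
    return "$word => $hash_ref->{$word}";
}

# %_ is not the argument list; it is a separate global hash.
# It is empty here, and an empty hash in scalar context gives 0.
sub scalar_of_empty_hash {
    my $value = %_;
    return $value;
}

my %f = (perl => 3);
print show_args('perl', \%f), "\n";    # prints: perl => 3
print scalar_of_empty_hash(), "\n";    # prints: 0
```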
At least now I know the script runs successfully up to this very last point, and the problem seems to be semantic rather than syntactic.
Any further advice on this last part would be really appreciated.
P.S.: Sorry pilgrims, I am just a novice on the path of Perl.