
I'm trying to split multiple joined words, and I have a Perl script that I grabbed from "How can I split multiple joined words?".

The script outputs multiple options, but I only need the last one, which is normally the correct one. What should I change in the script to achieve this?

#!/usr/bin/perl

use strict;

my $WORD_FILE = 'dic_master'; #Change as needed
my %words; # Hash of words in dictionary

# Open dictionary, load words into hash
open(my $fh, '<', $WORD_FILE) or die "Failed to open dictionary: $!\n";
while (<$fh>) {
  chomp;
  $words{lc($_)} = 1;
}
close($fh);

# Read one line at a time from stdin, break into words
while (<>) {
  chomp;
  my @words;
  find_words(lc($_));
}

sub find_words {
  # Print every way $string can be parsed into whole words
  my $string = shift;
  my @words = @_;
  my $length = length $string;

  foreach my $i ( 1 .. $length ) {
    my $word = substr $string, 0, $i;
    my $remainder = substr $string, $i, $length - $i;
    # Some dictionaries contain each letter as a word
    next if ($i == 1 && ($word ne "a" && $word ne "i"));

    if (defined($words{$word})) {
      push @words, $word;
      if ($remainder eq "") {
        print join(' ', @words), "\n";
        return;
      } else {
        find_words($remainder, @words);
      }
      pop @words;
    }
  }

  return;
}

Thanks!

Pedro Lobito
  • The question is not easy to understand. Could you describe what the script produces and what you want from it? – sergio Jul 30 '11 at 09:26
  • The script receives an argument, e.g. "bemyguest", and splits it into words based on a dictionary: be my gu est / be my gue st / be my guest. I just want it to output the last one. – Pedro Lobito Jul 30 '11 at 09:54

2 Answers


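Simply replace the print in find_words with an assignment to a variable, and print that variable once the for loop has finished. Because find_words is recursive, the variable has to be declared outside the sub, and the print belongs after the top-level call returns; each later match overwrites the earlier one, so only the last split is left.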
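
A minimal sketch of that change, not a drop-in replacement. It assumes a small inline word list (the %words hash below is a hypothetical stand-in for the dic_master dictionary), and the names $last and last_split are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the dictionary loaded from dic_master.
my %words = map { $_ => 1 } qw(be my gu est gue st guest);

my $last;    # most recent complete split; declared outside the recursive sub

sub find_words {
    my ($string, @words) = @_;
    my $length = length $string;

    foreach my $i (1 .. $length) {
        my $word      = substr $string, 0, $i;
        my $remainder = substr $string, $i;
        # Some dictionaries contain each letter as a word
        next if $i == 1 && $word ne 'a' && $word ne 'i';

        if (defined $words{$word}) {
            push @words, $word;
            if ($remainder eq '') {
                $last = join ' ', @words;    # assignment replaces the print
                return;
            }
            find_words($remainder, @words);
            pop @words;
        }
    }
    return;
}

# Hypothetical wrapper: returns the last split found, or undef if none.
sub last_split {
    my ($input) = @_;
    $last = undef;              # reset between inputs
    find_words(lc $input);
    return $last;
}

print last_split('bemyguest'), "\n";    # prints "be my guest"
```

In the original script, the while (<>) loop would call the wrapper once per input line and print the result only when it is defined.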

bvr

bvr's answer will address the immediate needs of the problem.

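A recommendation is to use exists instead of defined to check whether the string is present in the dictionary. exists tests for the key itself, whereas defined tests the value stored under that key, so exists states the intent directly and keeps working even if a value happens to be undef. With a flat hash such as %words, neither test adds non-words such as 'bemyg' as keys; autovivification only comes into play with nested lookups.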
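
A small sketch of the difference, assuming a hypothetical two-entry dictionary in which one value is deliberately undef; the helper names in_dict_exists and in_dict_defined are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical mini-dictionary; 'odd' deliberately has an undefined value.
my %words = (guest => 1, odd => undef);

sub in_dict_exists  { my ($w) = @_; return exists  $words{$w} ? 1 : 0 }
sub in_dict_defined { my ($w) = @_; return defined $words{$w} ? 1 : 0 }

for my $w (qw(guest odd bemyg)) {
    my $e = in_dict_exists($w);
    my $d = in_dict_defined($w);
    print "$w: exists=$e defined=$d\n";
}

# The lookups above did not add 'bemyg' as a key.
print scalar(keys %words), " keys\n";    # prints "2 keys"
```

The two tests disagree only for 'odd', whose key is present but whose value is undef. With a dictionary loaded as in the question, where every value is 1, both give the same answer.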

Zaid