I'm trying to make a word-censoring script, but I don't know why it isn't censoring the words properly. Only about 80% of them get censored.

This is my code:

    #!/usr/bin/perl -w
    use strict;

    my @text;
    my @cencoredText;

    my $file = "blabla\\text.txt";
    open(FH, "<", $file) or die "cant open file";

    while(<FH>)
    {
        push(@text,$_);
    }
    close(FH);

    my $cencoredFile = "blabla\\forbidden.txt";
    open(FH2, "<", $cencoredFile) or die "cant open file";

    while(<FH2>)
    {
        push(@cencoredText,$_);
    }

    close(FH2);

    for(my $i=0; $i<@cencoredText; $i++)
    {
        for(my $j=0; $j<@text; $j++)
        {
            $text[$j] =~ s/${cencoredText[$i]}/censored/g;
        }

    }

Both files open and the Perl script reads their contents. I don't know what's wrong. Thanks!

Jonathan Leffler
Dor12126
  • "my script isn't censoring the words properly. the censored status is 80%". What does this mean? What do you mean by "properly"? Have you tried printing out each line, and each word to check, to ensure your text handling is what you think it should be? – The Archetypal Paul Sep 11 '15 at 15:08
  • You didn't chomp; your 'words' have a newline at the end, so they won't often be matched. – Jonathan Leffler Sep 11 '15 at 15:08
  • I'm using RegexBuddy, and yes, I printed out every single cell in the array and it looks OK. – Dor12126 Sep 11 '15 at 15:19
  • If this is just a homework assignment or a toy script, your approach is fine, but [don't expect a regex-based profanity filter (or any blacklist-type content filter in general) to work 100% in the wild.](http://stackoverflow.com/a/6099598/176646) – ThisSuitIsBlackNot Sep 11 '15 at 15:27
  • This is a bit of a homework assignment. I want to build a word-censoring script. What's your way to do it? – Dor12126 Sep 11 '15 at 15:35
  • Please don't read entire files into memory: it's wasteful and very rarely necessary. Instead, you should read and process files line by line. But if you *do* need to read into an array then `my @text = <FH>` is what you want; there's no need to call `push` in a loop. – Borodin Sep 11 '15 at 15:50

1 Answer

To answer your direct question: you need to chomp the newline off the end of each input line that you read into your two arrays, @text and @cencoredText:

    ...
    while( <FH> ) {
        chomp;
        push(@text,$_);
    }
    close(FH);

    my $cencoredFile = "blabla\\forbidden.txt";
    open(FH2, "<", $cencoredFile) or die "cant open file";

    while(<FH2>) {
        chomp;
        push(@cencoredText,$_);
    }
    ...
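
To see why the missing chomp matters, here is a minimal sketch using made-up sample data (the strings `$line` and `$word` are hypothetical stand-ins for one line of each file, not the asker's actual contents). With the newline still attached, the pattern can only match when the forbidden word is the last thing on a line:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for one line read from each file.
my $line = "this is a darn long line\n";
my $word = "darn\n";    # as read by <FH2>: the newline is still attached

# Without chomp the pattern is "darn\n", so it only matches when the
# word is immediately followed by the end of the line.
(my $unchomped = $line) =~ s/$word/censored/g;

# After chomp the pattern is just "darn". \Q...\E keeps any regex
# metacharacters in the word from being interpreted.
chomp(my $clean = $word);
(my $chomped = $line) =~ s/\Q$clean\E/censored/g;

print $unchomped;    # this is a darn long line
print $chomped;      # this is a censored long line
```

This would also be consistent with only some words being censored: a word sitting at the very end of a line still matches even without the chomp.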

A few points not directly related to what you asked:

Are arrays really the best data structure choice to indicate that a word should be censored?

I am going to say no. To find the words that should be censored, you currently loop through each word in @cencoredText, and for each of those words you loop through each line of @text. With N lines of text and M forbidden words, that gives an overall complexity of O(N*M), which scales poorly as N and M grow. If you used a hash to represent the words that should be censored, you could reduce this to O(max(N,M)).
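
A rough sketch of the hash idea, under some assumptions: the sample arrays below are hypothetical stand-ins for the chomped file contents, a "word" is taken to be a run of `\w` characters, and matching is case-insensitive. Note that this censors whole words only, which differs from the substring matching in the question's code:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical inputs; in the question these come from text.txt and
# forbidden.txt (already chomped).
my @text         = ("hi my name is x and hi my name is y", "nothing to hide here");
my @cencoredText = ("hi", "name");

# Build the lookup table once, one entry per forbidden word.
my %forbidden;
$forbidden{lc $_} = 1 for @cencoredText;

# Visit each word of the text once and test it with a hash lookup
# instead of looping over the whole forbidden list.
for my $line (@text) {
    $line =~ s/(\w+)/$forbidden{lc $1} ? 'censored' : $1/ge;
}

print "$_\n" for @text;
# censored my censored is x and censored my censored is y
# nothing to hide here
```

Because the lookup is per word, "hide" is left alone even though it contains "hi".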

Alternatively, you could construct a single pattern containing every forbidden word and do one global substitution across your entire input file.
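
One possible way to build such a combined pattern is sketched below. The sample arrays are hypothetical, and the `\b` word boundaries are my own assumption about the desired behaviour (drop them if substrings should be censored too; they also will not behave as expected for forbidden entries that start or end with a non-word character):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical inputs standing in for the chomped file contents.
my @text         = ("hi my name is x and hi my name is y", "nothing to hide here");
my @cencoredText = ("hi", "name");

if (@cencoredText) {    # an empty list would give a pattern that matches everywhere
    # Escape regex metacharacters in each word and try longer words first
    # so that a short word cannot shadow a longer one it is a prefix of.
    my $alternation = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) } @cencoredText;

    # Compile the pattern once; \b restricts matches to whole words.
    my $pattern = qr/\b(?:$alternation)\b/;

    # One global substitution per line.
    s/$pattern/censored/g for @text;
}

print "$_\n" for @text;
# censored my censored is x and censored my censored is y
# nothing to hide here
```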

Hunter McMillen
  • Hi, I tried that, but I still have uncensored words. One line contains the word "hi". The line is: "hi my name is x and hi my name is y", and it is censored like this: "hi my name is x and cencored my name is y" – Dor12126 Sep 11 '15 at 15:16