
I'm trying to write a script that takes a text string and lets me replace random words in it. For example:

$str = "The quick brown fox jumps over the lazy dog";

I want to output it with a couple of words replaced, like this:

The quick ______ fox jumps over the ____ dog

I can probably do this by first splitting the string into an array:

$arr = str_word_count($str, 1);

And then replace $arr[2] and $arr[7].

The issue I expect is with non-word characters in the string, such as punctuation:

$str = "The quick brown fox, named Jack, jumps over the lazy dog; and Bingo was his...";

How do I go about resolving this? Ideas?

santa
  • You mean, for example, "fox," will be replaced instead of "fox"? (you mean this is the problem?) – jpf Feb 27 '21 at 01:57
  • It appears you could use `preg_replace` on each substring. As `preg_replace('/[a-zA-Z0-9]+/',...` etc. Words that are contractions could still be a problem though if single quotes are also possible as punctuation. – jpf Feb 27 '21 at 02:06
  • @jpf Selecting words without the punctuation is not the issue, `str_word_count` already does that. I believe the issue is the reconstruction of the sentence from the resulting array - it would lose all its original punctuation in the word replacement process. Though, `preg_replace` on randomly selected words is a good idea. – El_Vanja Feb 27 '21 at 02:53
  • @santa Is the number of replacements also random? Can the string be made of multiple sentences or will it always be a single one? – El_Vanja Feb 27 '21 at 02:58
  • Thanks for all suggestions. I actually did mean to replace with an underline, instead of other words. Yes the words will be replaced randomly. I'll probably add a count() to check how many letters in a word to replace with the same number of _ (underscore). And yes, the main challenge was to reconstruct the sentence with the original punctuation. – santa Feb 27 '21 at 13:39
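
One way to address the reconstruction problem raised in the comments above is to avoid rebuilding the sentence at all. A minimal sketch, assuming plain ASCII text: `str_word_count($str, 2)` returns each word keyed by its position in the string, so a word can be overwritten in place with `substr_replace` and the punctuation is never touched. The helper name `blankWordsAt` and its parameters are hypothetical, not from the thread.

```php
<?php
// Hypothetical helper (not from the thread): blank out the words at the
// given word indices and leave every other character untouched.
function blankWordsAt(string $str, array $wordIndices): string
{
    // Format 2 returns an array of [position in $str => word].
    $words = str_word_count($str, 2);
    $offsets = array_keys($words);

    foreach ($wordIndices as $i) {
        if (!isset($offsets[$i])) {
            continue; // index past the last word, nothing to blank
        }
        $offset = $offsets[$i];
        $length = strlen($words[$offset]);
        // The replacement has the same length as the word,
        // so the remaining offsets stay valid.
        $str = substr_replace($str, str_repeat('_', $length), $offset, $length);
    }

    return $str;
}

echo blankWordsAt("The quick brown fox, named Jack, jumps over the lazy dog; and Bingo was his...", [2, 9]);
// prints: The quick _____ fox, named Jack, jumps over the ____ dog; and Bingo was his...
```

Because each blank has exactly as many underscores as the word had letters, this also covers the "same number of _" requirement. Note that `str_word_count` works on bytes and depends on the locale, so multibyte text would probably need a different approach.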

2 Answers


You can do it like this:

    // replacement words
    $test1 = "test1";
    $test2 = "test2";
    $test3 = "Bingo2";

    $str = "The quick brown fox, named Jack, jumps over the lazy dog; and Bingo was his...";
    // split the string on spaces into the array $re
    $re = explode(" ", $str);
    echo $str . "<br>";

    foreach ($re as $key => $value) {
        echo $value . "<br>"; // debug output: show each token
        $word = "";

        switch (true) {
            case (strpos($value, 'Jack') !== false):
                // this token contains a word we want to replace:
                // split at that word to keep the punctuation around it
                $new = explode("Jack", $value);
                // replace the word and add the punctuation back
                $word = $new[0] . $test1 . $new[1];
                break;
            case (strpos($value, 'dog') !== false):
                $new1 = explode("dog", $value);
                $word = $new1[0] . $test2 . $new1[1];
                break;
            case (strpos($value, 'Bingo') !== false):
                $new2 = explode("Bingo", $value);
                $word = $new2[0] . $test3 . $new2[1];
                break;
            default:
                // no word to replace was found, keep the token as it is
                $word = $value;
        }

        // write the new token back at the same position
        $re[$key] = $word;
    }

    // join the tokens back together with spaces
    echo implode(" ", $re);

Result:

The quick brown fox, named test1, jumps over the lazy test2; and Bingo2 was his... 

It works with or without punctuation.

But keep in mind that if the text contains both Jack and Jacky, for example, you will need additional logic: check with a regex that the part left over after the split contains no letters, and skip the token if it does, because that means it was not a full match. Or something similar.
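
As a rough illustration of that extra check, here is a minimal sketch that uses a regex word boundary (`\b`) in place of the letter test described above. It assumes the word to replace consists only of word characters; the helper name `replaceWholeWord` is hypothetical.

```php
<?php
// Hypothetical helper: replace whole-word matches only, so "Jack" does not
// match inside "Jacky", while punctuation next to the word is kept.
function replaceWholeWord(string $str, string $word, string $replacement): string
{
    // \b matches at a word boundary; preg_quote() escapes any regex
    // metacharacters that $word might contain.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/';

    // A callback returns the replacement literally, so "$" or "\" in it
    // is not treated as a backreference.
    return preg_replace_callback($pattern, function () use ($replacement) {
        return $replacement;
    }, $str);
}

echo replaceWholeWord("Jacky met Jack, and Jack's dog.", "Jack", "test1");
// prints: Jacky met test1, and test1's dog.
```

The apostrophe counts as a non-word character here, which is why "Jack's" is also replaced; whether that is wanted depends on the text.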

EDIT (based on comments):

$wordsToReplace = ["Jacky", "Jack", "dog", "Bingo", "dontreplace"];
$replaceWith = "_";
$str = "The quick brown fox, named Jack, jumps over the lazy dog; and Bingo was his...";
echo $str . "<br>";
foreach ($wordsToReplace as $target) {
    $re = explode(" ", $str);
    foreach ($re as $key => $value) {
        if (strpos($value, $target) === false) {
            continue; // target is not in this token, leave it unchanged
        }
        // split around the target to keep any surrounding punctuation
        $new = explode($target, $value, 2);
        // replace only when no letter touches the target,
        // so "Jack" does not match inside "Jacky"
        if (!preg_match('/[a-zA-Z]$/', $new[0]) && !preg_match('/^[a-zA-Z]/', $new[1])) {
            $re[$key] = $new[0] . str_repeat($replaceWith, strlen($target)) . $new[1];
        }
    }
    $str = implode(" ", $re);
}
echo $str;

RESULT:

The quick brown fox, named Jack, jumps over the lazy dog; and Bingo was his...
The quick brown fox, named ____, jumps over the lazy ___; and _____ was his... 
ikiK
  • Sorry, I clarified a bit more in my comment above, but this is a great help and will let me push further. Instead of exact words it'll probably be a string on random keys that will be replaced. – santa Feb 27 '21 at 13:44
  • @santa Yeah, I read it now. I believe this can easily be converted to that use case. If the number of words to replace is (n), just create a loop for that number of words, use just one `if` or `case` as shown above, and go word by word. – ikiK Feb 27 '21 at 13:54
  • @santa Not sure what you meant by "random keys that will be replaced", but my last edit can surely be adjusted. – ikiK Feb 27 '21 at 14:53
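
One possible reading of "random keys" in the comments above is that the positions to blank are picked at random instead of being matched against a word list. A minimal sketch of the split-on-spaces approach adapted that way, assuming ASCII text; the helper name `blankRandomTokens` and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: blank $count randomly chosen space-separated tokens,
// keeping any punctuation attached to them.
function blankRandomTokens(string $str, int $count, string $fill = '_'): string
{
    $tokens = explode(' ', $str);

    // Only tokens that contain a letter are candidates; preg_grep() keeps
    // the original keys, so these are positions in $tokens.
    $candidates = array_keys(preg_grep('/[a-zA-Z]/', $tokens));
    $count = min($count, count($candidates));
    if ($count < 1) {
        return $str;
    }

    // Shuffle and take the first $count keys so no position is picked twice.
    shuffle($candidates);
    foreach (array_slice($candidates, 0, $count) as $key) {
        // Blank the first run of letters (contractions included) and leave
        // the rest of the token, such as "," or "...", as it is.
        $tokens[$key] = preg_replace_callback(
            "/[a-zA-Z]+(?:'[a-zA-Z]+)*/",
            function ($m) use ($fill) {
                return str_repeat($fill, strlen($m[0]));
            },
            $tokens[$key],
            1
        );
    }

    return implode(' ', $tokens);
}

echo blankRandomTokens("The quick brown fox, named Jack, jumps over the lazy dog; and Bingo was his...", 3);
```

The output differs on every run, but with the default single-character fill the string length and all punctuation stay the same as in the input.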

I think a much better approach would be to use a regex, because it handles not just commas but every character that is not a word character. A single regex pass also avoids the repeated splitting and substring searches in loops. My solution would be:

<?php
function randomlyRemovedWords($str)
{
    $sentenceParts = [];

    $wordCount = preg_match_all("/([\w']+)([^\w']*)/", $str, $sentenceParts, PREG_SET_ORDER);

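    // Roughly a quarter of the words get blanked; rand() can pick the same
    // index twice, so fewer words may actually change.
    for ($i = 0; $i < $wordCount / 4; $i++)
    {
        $index = rand(0, $wordCount - 1);

        // Swap every character of the chosen word for an underscore.
        $sentenceParts[$index][1] = preg_replace("/./", "_", $sentenceParts[$index][1]);
    }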

    $str = "";
    foreach ($sentenceParts as $part)
    {
        $str .= $part[1] . $part[2];
    }

    return $str;
}

echo randomlyRemovedWords("The quick brown fox, doesn't jumps over, the lazy dog.");
echo "\n<br>\n";
echo randomlyRemovedWords("The quick brown fox, jumps over, the lazy dog.");

which results in

The quick brown ___, _______ jumps over, the ____ dog.
<br>
The quick brown fox, jumps ____, ___ ____ dog.

This way you can be sure to ignore all nonword characters and remove words randomly.
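
A possible variation on the regex approach above, sketched under a few assumptions: it blanks an exact number of distinct words (no index is drawn twice), keeps any punctuation that precedes the first word, and uses the `u` modifier so that UTF-8 letters count as word characters. The function name `blankDistinctWords` is hypothetical.

```php
<?php
// Hypothetical variant: blank exactly $count distinct words, chosen at random.
function blankDistinctWords(string $str, int $count): string
{
    // Count the words first; letters, digits, "_" and "'" form a word.
    $total = preg_match_all("/[\w']+/u", $str);
    if ($total < 1 || $count < 1) {
        return $str;
    }

    // array_rand() returns distinct keys, and a single int when asked for
    // one key, hence the (array) cast. Flipping gives fast isset() lookups.
    $picked = (array) array_rand(range(0, $total - 1), min($count, $total));
    $chosen = array_flip($picked);

    $i = 0; // running word index, shared with the callback by reference
    return preg_replace_callback("/[\w']+/u", function ($m) use ($chosen, &$i) {
        $blank = isset($chosen[$i++]);
        // One underscore per character of the word.
        return $blank ? preg_replace('/./u', '_', $m[0]) : $m[0];
    }, $str);
}

echo blankDistinctWords("\"Wait,\" said the fox; it doesn't jump.", 3);
```

Because only the matched words are rewritten, everything between them, including the opening quote, passes through unchanged.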

TheBlueOne
  • I like this! I wish I could accept more than one answer. This one is definitely a different approach and I like the brevity. Definitely +1 – santa Feb 27 '21 at 15:04