
Unfortunately, for some strange reason the regex method isn't working for me with UTF-8 (preg_replace + UTF-8 doesn't work on one server but works on another).

What would be the most efficient way to accomplish my goal without using regex?

Just to make it as clear as possible, for the following set of words:
cat, dog, sky

cats would return false
the sky is blue would return true
skyrim would return false

Lior

3 Answers


My initial thought is to explode the text on spaces, and then check to see if your words exist in the resulting array. Of course you may have some punctuation leaking into your array that you'll have to consider as well.
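
A minimal sketch of that explode idea, for illustration only. The helper name `containsWholeWord` and the punctuation list are assumptions, not part of the original answer, and the list would need extending for real text. It works on bytes and compares tokens exactly, so it does not depend on regex UTF-8 support, but it is case-sensitive and only splits on plain spaces.

```php
<?php
// Hypothetical helper: returns true if any of $words appears in $text
// as a whole space-separated token.
function containsWholeWord(string $text, array $words): bool
{
    // Punctuation stripped from the edges of each token
    // (an assumed, incomplete list).
    $punctuation = ".,;:!?\"'()[]";

    foreach (explode(' ', $text) as $token) {
        $token = trim($token, $punctuation);
        // Strict comparison avoids PHP's loose type juggling.
        if ($token !== '' && in_array($token, $words, true)) {
            return true;
        }
    }
    return false;
}

$words = array('cat', 'dog', 'sky');
var_dump(containsWholeWord('cats', $words));            // bool(false)
var_dump(containsWholeWord('the sky is blue', $words)); // bool(true)
var_dump(containsWholeWord('skyrim', $words));          // bool(false)
```

For many words and very large texts, flipping the word list into array keys with `array_flip` and testing with `isset` would likely be faster than `in_array`, at the cost of a little extra setup.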

Another idea would be to check the strpos of the word. If it's found, test the character that follows it. If that character is a letter, the match is only part of a longer word, so discard it.

// Test online at http://writecodeonline.com/php/

$aWords = array( "I", "cat", "sky", "dog" );
$aFound = array();
$sSentence = "I have a cat. I don't have cats. I like the sky, but not skyrim.";

foreach ( $aWords as $word ) {
  // strpos() returns false when the word is absent; compare strictly,
  // because a match at position 0 is also falsy
  $pos = strpos( $sSentence, $word );
  if ( $pos === false ) continue;
  // If found, ensure it is not the start of a longer word
  $nextChar = substr( $sSentence, $pos + strlen( $word ), 1 );
  if ( ctype_alpha( $nextChar ) ) continue;
  $aFound[] = $word;
}

print_r( $aFound ); // Array ( [0] => I [1] => cat [2] => sky )
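
The snippet above only looks at the first occurrence of each word and only at the character after it. A slightly fuller sketch, assuming a hypothetical helper named `hasWholeWord`, walks every occurrence and checks both neighbours. It treats only ASCII letters as word characters (via `ctype_alpha`, assuming the default C locale), so a word sitting directly next to an accented UTF-8 letter would still count as a whole word.

```php
<?php
// Hypothetical helper: returns true if $word occurs in $text with no
// ASCII letter directly before or after it.
function hasWholeWord(string $text, string $word): bool
{
    $wordLength = strlen($word);
    if ($wordLength === 0) {
        return false;
    }

    $textLength = strlen($text);
    $offset = 0;
    // Visit every occurrence, not just the first one.
    while (($pos = strpos($text, $word, $offset)) !== false) {
        $end    = $pos + $wordLength;
        $before = $pos > 0 ? $text[$pos - 1] : '';
        $after  = $end < $textLength ? $text[$end] : '';

        // ctype_alpha('') is false, so string edges count as boundaries.
        if (!ctype_alpha($before) && !ctype_alpha($after)) {
            return true;
        }
        $offset = $pos + 1;
    }
    return false;
}

var_dump(hasWholeWord("I don't have cats, but I have a cat.", 'cat')); // bool(true)
var_dump(hasWholeWord('skyrim', 'sky'));                               // bool(false)
var_dump(hasWholeWord('bobcat', 'cat'));                               // bool(false)
```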

Of course the better solution is to determine why you cannot use regex, as these solutions will be nowhere near as efficient as pattern-seeking would be.

Sampson
  • The thing is - is it really the most efficient way when dealing with very large texts? – Lior Apr 10 '12 at 01:35
  • @Lior The most efficient thing is to figure out how to get regular expressions working. This is nowhere near as efficient as that. – Sampson Apr 10 '12 at 01:37
  • I just can't figure it out for the life of me... I honestly don't have a clue why it's not working, and can't wait any longer, unfortunately I've got to use another solution. – Lior Apr 10 '12 at 01:40

Super short example but it's the way I'd do it without Regex.

$haystack = "cats"; //"the sky is blue"; // "skyrim";
$needles = array("cat", "dog", "sky");

$found = false;
foreach($needles as $needle)
    if(strpos(" $haystack ", " $needle ") !== false) {
        $found = true;
        break;
    }


echo $found ? "A needle was found." : "A needle was not found.";
iambriansreed
  • Did you by any chance want to call [`substr_count`](http://php.net/manual/en/function.substr-count.php)? ;-) – Basti Apr 10 '12 at 01:54
  • I also think that `strpos` will perform better in this solution, as Lior is only interested in whether the `$haystack` contains a `$needle`, not the number of occurrences. See http://stackoverflow.com/a/3875258/1220835 – Basti Apr 10 '12 at 01:56

If you are simply trying to find whether a word is in a string, you could store the string in a variable (if you are printing the string, print the variable instead) and use the "in" operator (this is Python syntax). Example:

a = 'The sky is blue'
'The' in a
True