What is the fastest way to find the occurrence of a string in another string?

Question

Possible Duplicate:
Which method is preferred strstr or strpos ?

Hi!

Could you tell me which one is faster:
strstr($mystring, $findme);
OR
strpos($mystring, $findme);
OR
anything else
in finding the - first or any - occurrence of a string in another one?

Does it even matter in performance if I check the occurrence in a case-insensitive mode with stristr() OR stripos()?

In my case it doesn't matter in which exact position the given string is (if any), or how many times it occurs in the other one (if any), the only important question is if it even exists in the other string.

I've already found some comments about differences of speed in various articles (e.g. on php.net, someone says strstr() is faster in case there is a !== false check after strpos), but now I can't decide which is true.

If you know about any better methods of searching a string in another, please let me know!

Thank you very much for the relevant comments!

============

An example:


$mystring = 'blahblahblah';  
$findme = 'bla';  

if(strstr($mystring, $findme)){  
   echo 'got it';  
}  
else{  
   echo 'none';  
}  

echo PHP_EOL;

if(strpos($mystring, $findme) !== false){  
   echo 'got it';  
}  
else{  
   echo 'none';  
}

This is a micro micro optimization in my opinion. But I'm curious the answer ;) — Jason McCreary, Apr 28 '11 at 15:58
Execute them 10.000 times and measure time before and after, you'll know which one is faster. — Capsule, Apr 28 '11 at 16:00
Alix, you're right, I'm sorry for that, I didn't find this one. Capsule: this was a good idea, in the meantime I already made a test, I will post it soon. — Sk8erPeter, Apr 28 '11 at 16:47
aw, didn't think I get a downvote for being a little bit inattentive :(( — Sk8erPeter, Apr 28 '11 at 19:14

score 23 · Accepted Answer · edited Jun 07 '18 at 07:23

strpos seems to be in the lead. I tested it by searching for some strings in 'The quick brown fox jumps over the lazy dog':

strstr used 0.48487210273743 seconds for 1000000 iterations finding 'quick'
strpos used 0.40836095809937 seconds for 1000000 iterations finding 'quick'
strstr used 0.45261287689209 seconds for 1000000 iterations finding 'dog'
strpos used 0.39890813827515 seconds for 1000000 iterations finding 'dog'

<?php

    $haystack = 'The quick brown fox jumps over the lazy dog';

    $needle = 'quick';

    $iter = 1000000;

    $start = microtime(true);
    for ($i = 0; $i < $iter; $i++) {
        strstr($haystack, $needle);
    }
    $duration = microtime(true) - $start;
    echo "<br/>strstr used $duration seconds for $iter iterations finding 'quick' in 'The quick brown fox jumps over the lazy dog'";

    $start = microtime(true);
    for ($i = 0; $i < $iter; $i++) {
        strpos($haystack, $needle);
    }
    $duration = microtime(true) - $start;
    echo "<br/>strpos used $duration seconds for $iter iterations finding 'quick' in 'The quick brown fox jumps over the lazy dog'";

    $needle = 'dog';

    $start = microtime(true);
    for ($i = 0; $i < $iter; $i++) {
        strstr($haystack, $needle);
    }
    $duration = microtime(true) - $start;
    echo "<br/>strstr used $duration seconds for $iter iterations finding 'dog' in 'The quick brown fox jumps over the lazy dog'";

    $start = microtime(true);
    for ($i = 0; $i < $iter; $i++) {
        strpos($haystack, $needle);
    }
    $duration = microtime(true) - $start;
    echo "<br/>strpos used $duration seconds for $iter iterations finding 'dog' in 'The quick brown fox jumps over the lazy dog'";

?>
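
The four timing loops above follow the same pattern, so they could be folded into one helper. This is only a rough sketch, assuming PHP 7.3+ for hrtime(); the function name bench() is made up for illustration, and the closure call adds the same overhead to both candidates. Absolute numbers will differ by machine and PHP version.

```php
<?php
// Hypothetical helper (not part of the original answer): run $fn
// $iterations times and return the elapsed wall-clock time in seconds.
function bench(callable $fn, int $iterations): float
{
    $start = hrtime(true);               // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e9;
}

$haystack = 'The quick brown fox jumps over the lazy dog';
$iter = 100000;

foreach (['quick', 'dog'] as $needle) {
    $results = [
        'strstr' => bench(function () use ($haystack, $needle) {
            strstr($haystack, $needle);
        }, $iter),
        'strpos' => bench(function () use ($haystack, $needle) {
            strpos($haystack, $needle);
        }, $iter),
    ];
    foreach ($results as $name => $seconds) {
        printf("%s used %.4f seconds for %d iterations finding '%s'\n",
            $name, $seconds, $iter, $needle);
    }
}
```

Running the script several times and comparing the spread gives a better picture than a single run.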

Nice job, I believe this bench already was @ http://www.phpbench.com/ (if only it was up)! — Alix Axel, Apr 28 '11 at 16:20
Another one: http://net-beta.net/ubench/index.php?t=strpos1. — Alix Axel, Apr 28 '11 at 16:24
thanks for the benchmark, in the meantime I also made mine: http://bit.ly/mDE7sL . — Sk8erPeter, Apr 28 '11 at 19:13

Jason McCreary · Answer 2 · 2011-04-28T16:16:20.383

score 16

From the PHP Docs:

Note:

If you only want to determine if a particular needle occurs within haystack, use the faster and less memory intensive function strpos() instead.

I'm willing to take their word for it :)

edited Apr 28 '11 at 16:16

answered Apr 28 '11 at 16:00

Jason McCreary

Finally a real answer with sources and proofs. – Matthieu Napoli Apr 28 '11 at 16:03
@Matthieu, I assume you're being facetious. – Jason McCreary Apr 28 '11 at 16:09
@Matthieu There is no "source and proof" there. "Proof" would be a real-life benchmark with expected input and usage that could be repeatedly run. – Apr 28 '11 at 16:09
wow, thank you, I didn't notice this official info, while it was in front of my eyes, I just would have to open them :D:D – Sk8erPeter Apr 28 '11 at 16:56
@Jason @pst : Ok I got too excited... Maybe not proved, but if the PHP doc says so... They know what they are talking about. – Matthieu Napoli Apr 28 '11 at 17:27

Alix Axel · Answer 3 · 2011-04-28T23:17:48.003

score 6

The fastest way is:

if (strpos($haystack, $needle) !== false)

The case-insensitive versions should obviously be slower (at least 2x slower, I expect).
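
The strict `!== false` comparison matters because strpos() returns 0 when the needle sits at the very start of the haystack, and 0 is falsy. Here is a small sketch of a pure existence check; the helper names contains() and containsIgnoreCase() are invented for this example, and on PHP 8.0+ the built-in str_contains() already covers the case-sensitive variant.

```php
<?php
// Hypothetical helpers: report only whether $needle occurs in $haystack.
function contains(string $haystack, string $needle): bool
{
    return strpos($haystack, $needle) !== false;
}

function containsIgnoreCase(string $haystack, string $needle): bool
{
    return stripos($haystack, $needle) !== false;
}

var_dump(contains('blahblahblah', 'bla'));           // bool(true), match at offset 0
var_dump((bool) strpos('blahblahblah', 'bla'));      // bool(false): a loose check misreads offset 0
var_dump(containsIgnoreCase('blahblahblah', 'BLA')); // bool(true)
var_dump(contains('blahblahblah', 'xyz'));           // bool(false)
```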

strncmp() / substr() can possibly perform better, but only if you're checking whether $haystack starts with $needle and $haystack is considerably long (more than a few hundred characters or so).
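
To make the prefix case concrete, here is a sketch of a starts-with check built on strncmp(). The helper name startsWith() is an assumption for illustration; PHP 8.0+ ships str_starts_with() for the same purpose.

```php
<?php
// Hypothetical helper: true when $haystack begins with $needle.
function startsWith(string $haystack, string $needle): bool
{
    // strncmp() compares at most strlen($needle) bytes, so it never has to
    // scan the rest of a long haystack the way strpos() may.
    return strncmp($haystack, $needle, strlen($needle)) === 0;
}

$long = str_repeat('x', 10000) . 'needle';

var_dump(startsWith('needle in a haystack', 'needle')); // bool(true)
var_dump(startsWith($long, 'needle'));                   // bool(false), decided after the first byte
var_dump(strpos($long, 'needle') === 0);                 // bool(false), but only after scanning the string
```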

Benchmark:

strpos() vs. strncmp() = short | long

See other benchmarks @ http://net-beta.net/ubench/ (search for strpos).

A practical example where this kind of optimization does (kind of) matter is calculating hashcashes:

$count = 0;
$hashcash = sprintf('1:20:%u:%s::%u:', date('ymd'), $to, mt_rand());

while (strncmp('00000', sha1($hashcash . $count), 5) !== 0)
{
    ++$count;
}

$header['X-Hashcash'] = $hashcash . $count;

edited Apr 28 '11 at 23:17

answered Apr 28 '11 at 16:00

Alix Axel

you find out the duplicate and answered as well. Do not think both are different things? – Shakti Singh Apr 28 '11 at 16:04
@Shakti Singh: No, because this one has another detail (the case insensitivity). – Alix Axel Apr 28 '11 at 16:09
In this case you must remove your comment or answer otherwise it is creating confusion for some persons – Shakti Singh Apr 28 '11 at 16:15
@Shakti Singh: I don't have to do anything, the comment is automatic FYI and the main question is exactly the same - I don't see a reason not to close it. – Alix Axel Apr 28 '11 at 16:18
As your wish I thought it should be – Shakti Singh Apr 28 '11 at 16:20
wow, thanks, it shows that strncmp() is really faster :) I voted it up, I still have to decide which answer to accept :D – Sk8erPeter Apr 28 '11 at 16:45
@Sk8erPeter: No problem, you have to keep in mind that `strncmp()` is not **generally** faster - only if *at least* one of the conditions I mentioned is met - `strpos()` on the other hand is generally faster than any of the alternatives. Either way we are talking about micro optimizations here, it shouldn't matter more than the combined time of all answerers (unless you're doing this like a zillion times or so). :P – Alix Axel Apr 28 '11 at 16:50

score 2 · Answer 4 · answered Apr 28 '11 at 16:04

According to the PHP manual, strpos() is faster and less memory intensive than strstr().

Arjan

score 0 · Answer 5 · answered Apr 28 '11 at 16:01

Trick question. They do two different things. One returns the substring, the other returns the starting position of the substring within the string. The real answer is that you are comparing apples to oranges; use whichever one you need.
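
For illustration, this is what the two functions return for the same input (the sample strings are arbitrary):

```php
<?php
$haystack = 'The quick brown fox';

var_dump(strstr($haystack, 'quick')); // string(15) "quick brown fox"
var_dump(strpos($haystack, 'quick')); // int(4)

// Both return false when the needle is absent.
var_dump(strstr($haystack, 'cat'));   // bool(false)
var_dump(strpos($haystack, 'cat'));   // bool(false)
```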

CrayonViolent

+1 The only answer thus far that doesn't focus on a u-optimization. – Apr 28 '11 at 16:01
The question ain't tricky: **fastest way to *find the occurrence of a string in another string***. – Alix Axel Apr 28 '11 at 16:03
@Alix Axel And it's ... still a trick question. This can't be answered without a performance analysis as the expected input must be considered (string and match position). The best is to say "use the correct function". – Apr 28 '11 at 16:07
@Crayon Violent: I still disagree, `strstr()` internally is the same as `substr()` + `strpos()` so it will always be slower (and consume more memory since it returns a string). Either way, the "correct function" for the OP question is `str[i]pos()`. – Alix Axel Apr 28 '11 at 16:13
It's interesting that in the meantime I tried my own benchmark many times (I will post it), and I had to find out it's not calculable and/or certain that strstr is always slower. For example my last test's result: strstr(): 1.17782998, strpos() !== false: 1.191504, which means strstr was 0.01367402 faster... Hmmm. So maybe it depends. – Sk8erPeter Apr 28 '11 at 16:52
@Crayon Violent: I don't really think it's like a comparison of apples to oranges. I just asked for the FASTEST method to find the occurrence of a string in another one, as Alix already said. As I laid out in the beginning, I don't care about the exact position or the substring of the string, I was just curious about performance when finding something. – Sk8erPeter Apr 28 '11 at 17:01
@Sk8erPeter: Because your computer is not always as fast as the previous millisecond. ;) Good benchmarking is an art and the only way to be sure is to go back in time and start the other test at the exactly same tick as the previous one - which, of course, is impossible. The other (possible) alternative is to study the algorithmic complexity of each function, and, as you may guess from all the answers, this seems to favor `strpos()`. – Alix Axel Apr 28 '11 at 23:26
@Alix Axel: yes, that's right. :) – Sk8erPeter Apr 30 '11 at 13:05

score 0 · Answer 6 · answered Apr 28 '11 at 16:03

If the string A in which you want to find occurrences of a pattern B stays the same across many searches, then the fastest way is to build a suffix tree of A and run the searches for B against it.
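
A full suffix tree is too long to show here, so the sketch below swaps in a suffix array, a simpler relative built on the same idea: index A once, then answer each search for B by binary search. The function names are invented for this example, and the construction is deliberately naive; it only illustrates the principle.

```php
<?php
// Build a suffix array: the start offsets of all suffixes of $text, ordered
// by the suffix that begins there. Naive construction, for illustration only.
function buildSuffixArray(string $text): array
{
    if ($text === '') {
        return [];
    }
    $offsets = range(0, strlen($text) - 1);
    usort($offsets, function (int $a, int $b) use ($text): int {
        return strcmp(substr($text, $a), substr($text, $b));
    });
    return $offsets;
}

// Binary search over the sorted suffixes: $pattern occurs in $text exactly
// when it is a prefix of some suffix.
function suffixArrayContains(string $text, array $suffixes, string $pattern): bool
{
    $len = strlen($pattern);
    $lo = 0;
    $hi = count($suffixes) - 1;
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        $cmp = strcmp(substr($text, $suffixes[$mid], $len), $pattern);
        if ($cmp === 0) {
            return true;
        }
        if ($cmp < 0) {
            $lo = $mid + 1;
        } else {
            $hi = $mid - 1;
        }
    }
    return false;
}

$text = 'banana';
$suffixes = buildSuffixArray($text);                    // built once, reused per search
var_dump(suffixArrayContains($text, $suffixes, 'nan')); // bool(true)
var_dump(suffixArrayContains($text, $suffixes, 'nab')); // bool(false)
```

Building the index costs far more than a single strpos() call, so this only pays off when the same A is searched very many times.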

akappa

Could you explain a little bit better what you exactly mean? – Sk8erPeter Apr 28 '11 at 19:18

score 0 · Answer 7 · answered Apr 28 '11 at 16:07

I would think that strpos() would be faster because it only returns an integer (or false if no match was found). strstr() returns a string which contains all text after and including the first match.

For case insensitive searches, I would think that these would be slightly slower because they have to perform extra checks ("do the two chars match? if no, is the char a letter? if yes, does it match the lowercase version? if no, does it match the upper case version?", etc)
