16

Or: Should I optimize my string operations in PHP? I tried to look this up in the PHP manual, but I didn't find any hints.

Henrik Paul
  • 4
    I would suggest you don't worry about it unless/until it becomes a problem. It's pretty unusual for string operations to be the bottleneck in any modern web application. – Eli Jan 30 '09 at 18:57
  • The only time this has been a consideration for me was when setting each character of a large string by index in a loop. Since PHP strings are mutable this worked great, but only when I initialized the string to the correct size. I think I used str_repeat for the initialization. – Hans Mar 16 '14 at 01:35

6 Answers

23

PHP already optimises it - variables are assigned using copy-on-write, and objects are passed by reference. In PHP 4 it doesn't, but nobody should be using PHP 4 for new code anyway.

  • I've confirmed this with one of the PHP developers – phatduckk Feb 01 '09 at 02:17
  • 15
    I really wish Rich B would stop going around converting valid (British) English spelling into US English... Not all of us are from the US. – James B Mar 03 '09 at 22:33
  • I wish he'd make like a tree, personally. –  Mar 03 '09 at 23:02
  • -1 PHP never passes by reference if it isn't explicitly told to do so. Objects in PHP 5 are no exception. – NikiC Oct 17 '10 at 20:52
  • @nikic: it took zero effort to look this up on the PHP site - “PHP treats objects in the same way as references or handles, meaning that each variable contains an object reference rather than a copy of the entire object.” –  Oct 20 '10 at 20:07
  • 3
    @Ant P: The manual entry is ambiguous. What they want to say is that objects aren't stored in a variable that is being passed around, but that a pointer to the actual data is passed around. But still this pointer is passed by value, not by reference. Easy example that shows that: `function f($obj) { $obj = 'foo'; } $obj = new stdClass; f($obj); var_dump($obj);`. If `$obj` were passed by reference it would print `'foo'`, not `'stdClass'` ;) – NikiC Oct 21 '10 at 10:01
  • I know this is quite an old thread, but NikiC is incorrect. Your example does not work because although the object is passed by reference into your method, the altered value is within the local scope of the function, and as the variable is no longer an object it would not be accessible outside of that function. As seen with `function f($obj) { var_dump($obj); unset($obj); $obj = null; var_dump($obj); } $obj = new stdClass; $obj->test = true; f($obj); var_dump($obj); $obj = null; var_dump($obj);`. The result is that whilst the reference is wiped in the function, the object still exists outside of it. – corrodedmonkee Oct 04 '11 at 17:11
  • @NikiC What you are misunderstanding is that passing an object as a reference is not the same as making `$obj` a pointer to some variable. When you set `$obj` to 'foo', you discard its old value, which was a reference to an object, and set `$obj` to a local scope string. The function has then lost the reference to the object, but that does not mean the object is not still there outside of the function, and in this case it is unchanged. – Jason Mar 16 '13 at 19:03
  • @Jason You may want to read http://blog.golemon.com/2007/01/youre-being-lied-to.html to properly understand what I meant with my comment ;) – NikiC Mar 16 '13 at 19:14
  • Perhaps "reference like" is a better description. An object passed into a function can be manipulated like it were a reference to that object; a copy of the object is not made. Unlike a true reference (e.g. `$obj2 =& $obj1;`) you cannot set that object to something else, such as a "foo" from within the function. So `function bar($obj) {$obj->p='bar';}` will set the 'p' property of the passed object to 'bar', but `function bar($obj) {$obj='bar';}` will only set the local variable $obj to 'bar' and leave the object untouched. Even `unset($obj);` inside the function does not touch the object. TIL – Jason Mar 17 '13 at 15:28
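
The distinction argued over in the comments above can be sketched as follows. This is a minimal illustration, not code from any of the commenters; the function names `mutateProperty`, `reassignLocal` and `reassignByRef` are hypothetical.

```php
<?php
// Hypothetical helper names, used only for illustration.

function mutateProperty($obj) {
    $obj->p = 'bar';   // changes the single shared object via its handle
}

function reassignLocal($obj) {
    $obj = 'foo';      // rebinds only the function's local variable
}

function reassignByRef(&$obj) {
    $obj = 'foo';      // true reference: rebinds the caller's variable
}

$o = new stdClass;

mutateProperty($o);
$afterMutate = $o->p;          // the property set inside the function is visible here

reassignLocal($o);
$stillObject = is_object($o);  // the caller's variable still holds the object

reassignByRef($o);
$afterRef = $o;                // only now is the caller's variable replaced

var_dump($afterMutate, $stillObject, $afterRef);
```

Only the variant declared with `&` replaces the caller's variable; the other two operate on a copy of the object handle.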
5

One of the most essential speed optimization techniques in many languages is instance reuse. In that case the speed increase comes from at least 2 factors:

1. Fewer instantiations mean less time spent on construction.

2. The less memory the application uses, the fewer CPU cache misses there are likely to be.

For applications where speed is the #1 priority, there is a truly tight bottleneck between the CPU and the RAM. One of the reasons for the bottleneck is the latency of the RAM.

PHP, Ruby, Python, etc. are affected by cache misses because even they store at least some (probably all) of the run-time data of the interpreted programs in RAM.

String instantiation is an operation that is done pretty often, in relatively "huge quantities", and it may have a noticeable impact on speed.

Here's a run_test.bash of a measurement experiment:

#!/bin/bash

for i in `seq 1 200`;
do
        /usr/bin/time -p -a -o ./measuring_data.rb  php5 ./string_instantiation_speedtest.php
done

Here are the ./string_instantiation_speedtest.php and the measurement results:

<?php

// The comments on the next 2 lines show the arithmetic mean of
// (user time + sys time) for 200 runs. Comment out one of the 2 lines
// to select a variant; as written, the second assignment wins.
$b_instantiate=False; // 0.1624 seconds
$b_instantiate=True;  // 0.1676 seconds
// The time consumed by the reference version is about 97% of the
// time consumed by the instantiation version, but a thing to notice is
// that the loop contains at least 1, probably 2, possibly 4,
// string instantiations at the array_push line.
$ar=array();
$s='This is a string.';
$n=10000;
$s_1=NULL;

for($i=0;$i<$n;$i++) {
    if($b_instantiate) {
        $s_1=''.$s;
    } else {
        $s_1=&$s;
    }
    // The rand is for avoiding optimization at storage.
    array_push($ar,''.rand(0,9).$s_1);
} // for

// Valid indices are 0..$n-1; rand() includes its upper bound.
echo($ar[rand(0,$n-1)]."\n");

?>

My conclusion from this experiment, and from one other experiment that I did with Ruby 1.8, is that it makes sense to pass string values around by reference.

One possible way to make "pass-strings-by-reference" work across the whole application is to consistently create a new string instance whenever one needs a modified version of a string.

To increase locality, and therefore speed, one may want to decrease the amount of memory that each of the operands consumes. The following experiment demonstrates the case for string concatenations:

<?php

// The comments on the
// next 2 lines show arithmetic mean of (user time + sys time) for 200 runs.
$b_suboptimal=False; // 0.0611 seconds
$b_suboptimal=True;  // 0.0785 seconds
// The time consumed by the optimal version is about 78% of the
// time consumed by the suboptimal version.
//
// The number of concatenations is the same and the resultant
// string is the same, but what differs is the "average" and maximum
// lengths  of the tokens that are used for assembling the $s_whole.
$n=1000;
$s_token="This is a string with a Linux line break.\n";
$s_whole='';

if($b_suboptimal) {
    for($i=0;$i<$n;$i++) {
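        // Same operand order as in the other branch, so that both
        // branches really produce the same string.
        $s_whole=$s_whole.$i.$s_token;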
    } // for
} else {
    $i_watershed=(int)round((($n*1.0)/2),0);
    $s_part_1='';
    $s_part_2='';
    for($i=0;$i<$i_watershed;$i++) {
        $s_part_1=$s_part_1.$i.$s_token;
    } // for
    for($i=$i_watershed;$i<$n;$i++) {
        $s_part_2=$s_part_2.$i.$s_token;
    } // for
    $s_whole=$s_part_1.$s_part_2;
} // else

// To circumvent possible optimization one actually "uses" the
// value of the $s_whole.
$file_handle=fopen('./it_might_have_been_a_served_HTML_page.txt','w');
fwrite($file_handle, $s_whole);
fclose($file_handle);

?>

For example, if one assembles HTML pages that contain a considerable amount of text, then one might want to think about the order in which the different parts of the generated HTML are concatenated.

BSD-licensed PHP and Ruby implementations of the watershed string concatenation algorithm are available. The same algorithm can be (and has been, by me) generalized to speed up the multiplication of arbitrary precision integers.
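
The linked implementations are not reproduced here. As a rough sketch of the idea, assuming that the two-half split from the experiment is applied repeatedly, one could join neighbouring pieces pairwise so that most concatenations work on short operands. The function name `watershed_concat` is hypothetical and this is not the BSD-licensed code mentioned above.

```php
<?php
// Sketch only: watershed_concat is a hypothetical name, not the
// author's published implementation.
function watershed_concat(array $tokens) {
    $tokens = array_values($tokens);   // make sure keys are 0..n-1
    $n = count($tokens);
    if ($n === 0) {
        return '';
    }
    // Join neighbouring pairs until a single string remains, so that
    // long intermediate strings appear only in the last few rounds.
    while ($n > 1) {
        $next = array();
        for ($i = 0; $i + 1 < $n; $i += 2) {
            $next[] = $tokens[$i] . $tokens[$i + 1];
        }
        if ($n % 2 === 1) {
            $next[] = $tokens[$n - 1]; // odd piece is carried to the next round
        }
        $tokens = $next;
        $n = count($tokens);
    }
    return $tokens[0];
}

// Usage with the same kind of tokens as in the experiment above.
$tokens = array();
for ($i = 0; $i < 1000; $i++) {
    $tokens[] = $i . "This is a string with a Linux line break.\n";
}
$s_whole = watershed_concat($tokens);
```

When all pieces are already in an array, the built-in `implode('', $tokens)` produces the same string and is worth measuring against any hand-written scheme.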

Martin Vahi
4

Arrays and strings have copy-on-write behaviour. They are mutable, but when you assign one to another variable, that variable initially refers to the exact same instance of the string or array. Only when you modify the array or string is a copy made.

Example:

$a = array_fill(0, 10000, 42);  //Consumes 545744 bytes
$b = $a;                        //   "         48   "
$b[0] = 42;                     //   "     545656   "

$s = str_repeat(' ', 10000);    //   "      10096   "
$t = $s;                        //   "         48   "
$t[0] = '!';                    //   "      10048   "
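
Figures like the ones above can be reproduced with `memory_get_usage()`. The following is a minimal sketch; the absolute byte counts depend on the PHP version and platform, so none are claimed here.

```php
<?php
// Sketch: measure memory growth at each step of the string example.
// Absolute byte counts differ between PHP versions and platforms.
$m0 = memory_get_usage();
$s  = str_repeat(' ', 10000);  // allocate the string
$m1 = memory_get_usage();
$t  = $s;                      // assignment: no character data is copied yet
$m2 = memory_get_usage();
$t[0] = '!';                   // first write separates $t from $s
$m3 = memory_get_usage();

printf("create: %d, assign: %d, write: %d\n", $m1 - $m0, $m2 - $m1, $m3 - $m2);

// The write went to $t only; $s keeps its original first character.
var_dump($s[0], $t[0]);
```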
3

A quick Google search suggests that they are mutable, but the preferred practice is to treat them as immutable.

brettkelly
0

Strings are mutable in PHP 7.4:

<?php
$str = "Hello\n";
echo $str;
$str[2] = 'y';
echo $str;

Output:

Hello
Heylo

Test: PHP Sandbox

Vyacheslav
-5

PHP strings are immutable.

Try this:

    $a="string";
    echo "<br>$a<br>";
    echo str_replace('str','b',$a);
    echo "<br>$a";

It echoes:

string
bing
string

If a string were mutable, it would have continued to show "bing".

netrox
  • 11
    The fact that str_replace returns a modified copy of a string doesn't prove a string to be immutable. This demonstrates a string to be mutable however: `$a = 'hello '; echo $a; $a[3] = 'd'; echo $a;` – Anon343224user Jan 30 '14 at 17:10
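
The two observations can be put side by side in a short sketch (variable names are chosen for illustration only): `str_replace` returns a new string and leaves its argument alone, while writing to a character offset modifies the variable it is applied to, without affecting other variables that were assigned the same value.

```php
<?php
$a = 'string';

// str_replace returns a new string; $a itself is not modified.
$b = str_replace('str', 'b', $a);

// Writing to an offset changes $c in place. Because of copy-on-write,
// $c is separated from $a first, so $a keeps its value.
$c = $a;
$c[0] = 'S';

var_dump($a, $b, $c);
```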