76

In languages like Java and C#, strings are immutable and it can be computationally expensive to build a string one character at a time. In said languages, there are library classes to reduce this cost such as C# System.Text.StringBuilder and Java java.lang.StringBuilder.

Does PHP (4 or 5; I'm interested in both) share this limitation? If so, are there similar solutions to the problem available?

Chris

12 Answers

65

No, there is no type of stringbuilder class in PHP, since strings are mutable.

That being said, there are different ways of building a string, depending on what you're doing.

echo, for example, will accept comma-separated tokens for output.

// This...
echo 'one', 'two';

// Is the same as this
echo 'one';
echo 'two';

What this means is that you can output a complex string without actually using concatenation, which would be slower:

// This...
echo 'one', 'two';

// Is faster than this...
echo 'one' . 'two';

If you need to capture this output in a variable, you can do that with the output buffering functions.
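
As a rough sketch of the output-buffering approach mentioned above: `ob_start()` and `ob_get_clean()` are standard PHP functions, while the helper name `captureOutput` is made up for this illustration.

```php
<?php
// Hypothetical helper: runs a callback and returns everything it echoed.
function captureOutput($render)
{
    ob_start();              // start collecting output instead of sending it
    $render();               // anything echoed here goes into the buffer
    return ob_get_clean();   // return the buffer contents and stop buffering
}

$captured = captureOutput(function () {
    echo 'one', 'two';       // comma-separated echo, no concatenation
});

// $captured now holds the echoed text; nothing was sent to the client.
```

The closure syntax assumes PHP 5.3 or later. On older versions the same three steps (`ob_start()`, `echo`, `ob_get_clean()`) can simply be written inline.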

Also, PHP's array performance is really good. If you want to do something like a comma-separated list of values, just use implode():

$values = array( 'one', 'two', 'three' );
$valueList = implode( ', ', $values );

Lastly, make sure you familiarize yourself with PHP's string type and its different delimiters, and the implications of each.
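
To make the point about delimiters concrete, here is a minimal sketch of the four string syntaxes and how each treats variables and escape sequences. The variable names are only for illustration, and nowdoc assumes PHP 5.3 or later.

```php
<?php
$name = 'world';

// Single quotes: no variable expansion, \n stays as two literal characters.
$single = 'Hello, $name\n';

// Double quotes: $name is expanded and \n becomes a newline.
$double = "Hello, $name\n";

// Heredoc behaves like a double-quoted string.
$heredoc = <<<EOT
Hello, $name
EOT;

// Nowdoc behaves like a single-quoted string.
$nowdoc = <<<'EOT'
Hello, $name
EOT;
```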

Peter Bailey
  • 30
    And use single-quotes whenever possible. – Stephen Dec 29 '10 at 23:15
  • 1
    why not double quotes? – Tebe Oct 05 '13 at 15:52
  • 6
    @gekannt Because PHP expands/interprets variables as well as extra escape sequences in strings that are enclosed in double quotes. For example, `$x = 5; echo "x = $x";` would print `x = 5` while `$x = 5; echo 'x = $x';` would print `x = $x`. – samitny Oct 10 '13 at 19:37
  • one can need it to be expanded as well as not to be expanded/interpret, it depends upon the situation – Tebe Oct 11 '13 at 18:31
  • 21
    Bit of a myth, the single quote thing: http://nikic.github.io/2012/01/09/Disproving-the-Single-Quotes-Performance-Myth.html – alimack Apr 03 '14 at 11:48
  • Good info, @alimack, but for the record, this answer isn't about single vs double quotes nor is it about concatenation vs interpolation. It's about using `echo` with parameterized tokens vs concatenated tokens. – Peter Bailey Apr 05 '14 at 22:11
  • Please use always echo with dot, and double quotes where it increases code readability. Such optimizations are evil – Denys Klymenko Oct 11 '17 at 13:12
36

I was curious about this, so I ran a test. I used the following code:

<?php
ini_set('memory_limit', '1024M');
define ('CORE_PATH', '/Users/foo');
define ('DS', DIRECTORY_SEPARATOR);

$numtests = 1000000;

function test1($numtests)
{
    $CORE_PATH = '/Users/foo';
    $DS = DIRECTORY_SEPARATOR;
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        $a[] = sprintf('%s%sDesktop%sjunk.php', $CORE_PATH, $DS, $DS);
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 1: sprintf()\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

function test2($numtests)
{
    $CORE_PATH = '/Users/shigh';
    $DS = DIRECTORY_SEPARATOR;
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        $a[] = $CORE_PATH . $DS . 'Desktop' . $DS . 'junk.php';
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 2: Concatenation\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

function test3($numtests)
{
    $CORE_PATH = '/Users/shigh';
    $DS = DIRECTORY_SEPARATOR;
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        ob_start();
        echo $CORE_PATH,$DS,'Desktop',$DS,'junk.php';
        $aa = ob_get_contents();
        ob_end_clean();
        $a[] = $aa;
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 3: Buffering Method\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

function test4($numtests)
{
    $CORE_PATH = '/Users/shigh';
    $DS = DIRECTORY_SEPARATOR;
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        $a[] = "{$CORE_PATH}{$DS}Desktop{$DS}junk.php";
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 4: Braced in-line variables\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

function test5($numtests)
{
    $a = array();

    $startmem = memory_get_usage();
    $a_start = microtime(true);
    for ($i = 0; $i < $numtests; $i++) {
        $CORE_PATH = CORE_PATH;
        $DS = DIRECTORY_SEPARATOR;
        $a[] = "{$CORE_PATH}{$DS}Desktop{$DS}junk.php";
    }
    $a_end = microtime(true);
    $a_mem = memory_get_usage();

    $timeused = $a_end - $a_start;
    $memused = $a_mem - $startmem;

    echo "TEST 5: Braced inline variables with loop-level assignments\n";
    echo "TIME: {$timeused}\nMEMORY: $memused\n\n\n";
}

test1($numtests);
test2($numtests);
test3($numtests);
test4($numtests);
test5($numtests);

... and got the following results. [Image: benchmark results for the five tests] Clearly, sprintf() is the least efficient way to do it, both in terms of time and memory consumption.

Evilnode
  • 1
    should have 1 more test: similar to `test2` but replace `.` with `,` (without output buffer, of course) – Raptor Nov 21 '13 at 07:26
  • 1
    Very useful, thank you. String concatenation appears to be the way to go. It makes sense that they'd try and optimize the hell out of that. – Chris Middleton Sep 18 '14 at 19:03
16

A StringBuilder analog is not needed in PHP.

I made a couple of simple tests:

in PHP:

$iterations = 10000;
$stringToAppend = 'TESTSTR';
$timer = new Timer(); // based on microtime()
$s = '';
for($i = 0; $i < $iterations; $i++)
{
    $s .= ($i . $stringToAppend);
}
$timer->VarDumpCurrentTimerValue();

$timer->Restart();

// Used purlogic's implementation.
// I tried other implementations, but they are not faster
$sb = new StringBuilder(); 

for($i = 0; $i < $iterations; $i++)
{
    $sb->append($i);
    $sb->append($stringToAppend);
}
$ss = $sb->toString();
$timer->VarDumpCurrentTimerValue();

in C# (.NET 4.0):

const int iterations = 10000;
const string stringToAppend = "TESTSTR";
string s = "";
var timer = new Timer(); // based on StopWatch

for(int i = 0; i < iterations; i++)
{
    s += (i + stringToAppend);
}

timer.ShowCurrentTimerValue();

timer.Restart();

var sb = new StringBuilder();

for(int i = 0; i < iterations; i++)
{
    sb.Append(i);
    sb.Append(stringToAppend);
}

string ss = sb.ToString();

timer.ShowCurrentTimerValue();

Results:

10000 iterations:
1) PHP, ordinary concatenation: ~6ms
2) PHP, using StringBuilder: ~5 ms
3) C#, ordinary concatenation: ~520ms
4) C#, using StringBuilder: ~1ms

100000 iterations:
1) PHP, ordinary concatenation: ~63ms
2) PHP, using StringBuilder: ~555ms
3) C#, ordinary concatenation: ~91000ms // !!!
4) C#, using StringBuilder: ~17ms

nightcoder
  • Java is more or less the same as C# in this. Though the later versions have done some optimization at compile time to help alleviate this. It used to be the case (in 1.4 and earlier, maybe even in 1.6) that if you have 3 or more elements to concatenate, you were better off using a StringBuffer/Builder. Though in a loop, you still need to use the StringBuilder. – A.Grandt Jan 11 '14 at 09:04
  • In other words, PHP was designed for people who don't want to have to worry about low level considerations and it does string buffering internally on the string type. This is not to do with strings being "mutable" on PHP; growing a string's length still requires a memory copy to a larger piece of memory unless you maintain a buffer for it to grow into. – thomasrutter Mar 24 '17 at 22:25
  • BTW this should be the accepted answer. The current top answers don't even actually answer the question. – thomasrutter Mar 24 '17 at 22:30
12

When you do a timed comparison, the differences are so small that they aren't very relevant. It would make more sense to go for the choice that makes your code easier to read and understand.

SeanDowney
  • 2
    Indeed, worrying about this is just outright silly, when there are usually far more important issues to worry about, like database design, big O() analysis, and proper profiling. – DGM Sep 24 '08 at 04:18
  • 2
    That is very true, but I HAVE seen situations in Java and C# where using a mutable string class (vs. s += "blah") have indeed increased performance dramatically. – Pete Alvin Sep 29 '10 at 00:46
  • This kind of performance optimization is important when you have to manipulate a string with hundreds of thousand of characters in a while loop that breaks only when PHP gets out of execution time or memory - my case – Lucas Bustamante Apr 25 '21 at 00:59
10

I know what you're talking about. I just created this simple class to emulate the Java StringBuilder class.

class StringBuilder {

  private $str = array();

  public function __construct() { }

  public function append($str) {
    $this->str[] = $str;
  }

  public function toString() {
    return implode($this->str);
  }

}
Jan Turoň
ossys
  • 9
    Nice solution. At the end of the `append` function you can add `return $this;` to allow method chaining: `$sb->append("one")->append("two");`. – Jabba Dec 03 '10 at 21:20
  • 8
    This is completely unnecessary in PHP. In fact, I'm willing to bet that this is significantly slower than doing regular concatenation. – ryeguy Apr 27 '11 at 14:55
  • 10
    ryeguy: true, being that strings are mutable in PHP this method is "unnecessary", the person asked for a similar implementation to Java's StringBuilder, so here you go... I wouldn't say it's "significantly" slower, I think you're being a little dramatic. The overhead of instantiating a class that manages the string building may include costs, but the usefulness of the StringBuilder class can be expanded to include additional methods on the string. I'll look into what additional overhead is realized by implementing something like this in a class and try to post back. – ossys May 11 '11 at 03:22
  • 7
    ... and he was never heard from again. – Nigralbus Nov 07 '13 at 15:59
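
The method-chaining suggestion from the comments could be sketched roughly like this. It is only an illustrative variant of the class above; the name `ChainableStringBuilder` is invented here, and no claim is made that it is faster than plain `.=`.

```php
<?php
// Illustrative variant of the StringBuilder class; the class name is made up.
class ChainableStringBuilder
{
    private $parts = array();

    public function append($str)
    {
        $this->parts[] = $str;
        return $this;           // returning the object allows chained calls
    }

    public function toString()
    {
        // Join all collected parts with an empty separator.
        return implode('', $this->parts);
    }
}

$sb = new ChainableStringBuilder();
$result = $sb->append('one')->append('two')->toString();
```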
6

PHP strings are mutable. You can change specific characters like this:

$string = 'abc';
$string[2] = 'a'; // $string equals 'aba'
$string[3] = 'd'; // $string equals 'abad'
$string[5] = 'e'; // $string equals 'abad e' (fills character(s) in between with spaces)

And you can append characters to a string like this:

$string .= 'a';
Paige Ruten
  • I'm no expert in php. Is "$string .= 'a'" not a short form of "$string = $string . 'a'" and is php not creating a new string (and not changing the old one)? – Wolfgang Adamec Feb 18 '13 at 09:02
  • Yes it is a short form. But to your second question, PHP's internal behaviour is such that it's effectively like replacing the string with one that's a byte longer. Internally though, it does buffering like StringBuilder. – thomasrutter Mar 24 '17 at 22:26
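
A small sketch of what "mutable" means in practice here: strings are still assigned by value, so changing one variable in place does not affect another variable that was assigned from it. The variable names are only for illustration.

```php
<?php
$original = 'abc';
$copy = $original;   // assigned by value; the engine only copies when needed

$copy[0] = 'x';      // modify a character in place
$copy .= 'd';        // append to the copy

// $original is untouched, only $copy changed.
var_dump($original, $copy);
```

So in-place modification is a property of the variable's own string, not something shared across variables the way a Java `StringBuilder` reference would be.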
4

I wrote the code at the end of this post to test the different forms of string concatenation, and they really are all almost exactly equal in both memory and time footprints.

The two primary methods I used are concatenating strings onto each other, and filling an array with strings and then imploding them. I did 500 string additions with a 1MB string in PHP 5.6 (so the result is a 500MB string). At every iteration of the test, all memory and time footprints were very close (at ~$IterationNumber*1MB). The runtimes of the two tests were 50.398 seconds and 50.843 seconds respectively, which is most likely within an acceptable margin of error.

Garbage collection of strings that are no longer referenced seems to be pretty immediate, even without ever leaving the scope. Since the strings are mutable, no extra memory is really required after the fact.

HOWEVER, the following tests showed that there is a difference in peak memory usage WHILE the strings are being concatenated.

$OneMB=str_repeat('x', 1024*1024);
$Final=$OneMB.$OneMB.$OneMB.$OneMB.$OneMB;
print memory_get_peak_usage();

Result=10,806,800 bytes (~10MB w/o the initial PHP memory footprint)

$OneMB=str_repeat('x', 1024*1024);
$Final=implode('', Array($OneMB, $OneMB, $OneMB, $OneMB, $OneMB));
print memory_get_peak_usage();

Result=6,613,320 bytes (~6MB w/o the initial PHP memory footprint)

So there is in fact a difference that could be significant memory-wise in very large string concatenations (I have run into such cases when creating very large data sets or SQL queries).

But even this is disputable depending upon the data. For example, concatenating 1 character at a time onto a string to get 50 million bytes (so 50 million iterations) took a maximum of 50,322,512 bytes (~48MB) in 5.97 seconds, while the array method ended up using 7,337,107,176 bytes (~6.8GB) to create the array in 12.1 seconds, and then took an extra 4.32 seconds to combine the strings from the array.

Anywho... the below is the benchmark code I mentioned at the beginning which shows the methods are pretty much equal. It outputs a pretty HTML table.

<?php
//Please note, for the recursion test to go beyond 256, xdebug.max_nesting_level needs to be raised. You also may need to update your memory_limit depending on the number of iterations

//Output the start memory
print 'Start: '.memory_get_usage()."B<br><br>Below test results are in MB<br>";

//Our 1MB string
global $OneMB, $NumIterations;
$OneMB=str_repeat('x', 1024*1024);
$NumIterations=500;

//Run the tests
$ConcatTest=RunTest('ConcatTest');
$ImplodeTest=RunTest('ImplodeTest');
$RecurseTest=RunTest('RecurseTest');

//Output the results in a table
OutputResults(
  Array('ConcatTest', 'ImplodeTest', 'RecurseTest'),
  Array($ConcatTest, $ImplodeTest, $RecurseTest)
);

//Start a test run by initializing the array that will hold the results and manipulating those results after the test is complete
function RunTest($TestName)
{
  $CurrentTestNums=Array();
  $TestStartMem=memory_get_usage();
  $StartTime=microtime(true);
  RunTestReal($TestName, $CurrentTestNums, $StrLen);
  $CurrentTestNums[]=memory_get_usage();

  //Subtract $TestStartMem from all other numbers
  foreach($CurrentTestNums as &$Num)
    $Num-=$TestStartMem;
  unset($Num);

  $CurrentTestNums[]=$StrLen;
  $CurrentTestNums[]=microtime(true)-$StartTime;

  return $CurrentTestNums;
}

//Initialize the test and store the memory allocated at the end of the test, with the result
function RunTestReal($TestName, &$CurrentTestNums, &$StrLen)
{
  $R=$TestName($CurrentTestNums);
  $CurrentTestNums[]=memory_get_usage();
  $StrLen=strlen($R);
}

//Concatenate 1MB string over and over onto a single string
function ConcatTest(&$CurrentTestNums)
{
  global $OneMB, $NumIterations;
  $Result='';
  for($i=0;$i<$NumIterations;$i++)
  {
    $Result.=$OneMB;
    $CurrentTestNums[]=memory_get_usage();
  }
  return $Result;
}

//Create an array of 1MB strings and then join w/ an implode
function ImplodeTest(&$CurrentTestNums)
{
  global $OneMB, $NumIterations;
  $Result=Array();
  for($i=0;$i<$NumIterations;$i++)
  {
    $Result[]=$OneMB;
    $CurrentTestNums[]=memory_get_usage();
  }
  return implode('', $Result);
}

//Recursively add strings onto each other
function RecurseTest(&$CurrentTestNums, $TestNum=0)
{
  Global $OneMB, $NumIterations;
  if($TestNum==$NumIterations)
    return '';

  $NewStr=RecurseTest($CurrentTestNums, $TestNum+1).$OneMB;
  $CurrentTestNums[]=memory_get_usage();
  return $NewStr;
}

//Output the results in a table
function OutputResults($TestNames, $TestResults)
{
  global $NumIterations;
  print '<table border=1 cellspacing=0 cellpadding=2><tr><th>Test Name</th><th>'.implode('</th><th>', $TestNames).'</th></tr>';
  $FinalNames=Array('Final Result', 'Clean');
  for($i=0;$i<$NumIterations+2;$i++)
  {
    $TestName=($i<$NumIterations ? $i : $FinalNames[$i-$NumIterations]);
    print "<tr><th>$TestName</th>";
    foreach($TestResults as $TR)
      printf('<td>%07.4f</td>', $TR[$i]/1024/1024);
    print '</tr>';
  }

  //Other result numbers
  print '<tr><th>Final String Size</th>';
  foreach($TestResults as $TR)
    printf('<td>%d</td>', $TR[$NumIterations+2]);
  print '</tr><tr><th>Runtime</th>';
    foreach($TestResults as $TR)
      printf('<td>%s</td>', $TR[$NumIterations+3]);
  print '</tr></table>';
}
?>
Dakusan
4

I just came across this problem:

$str .= 'String concatenation. ';

vs.

$str = $str . 'String concatenation. ';

It seems no one has compared this here so far. And the results are quite crazy with 50,000 iterations and PHP 7.4:

String 1: 0.0013918876647949

String 2: 1.1183910369873

Faktor: 803 !!!

$currentTime = microtime(true);
$str = '';
for ($i = 50000; $i > 0; $i--) {
    $str .= 'String concatenation. ';
}
$currentTime2 = microtime(true);
echo "String 1: " . ( $currentTime2 - $currentTime);

$str = '';
for ($i = 50000; $i > 0; $i--) {
    $str = $str . 'String concatenation. ';
}
$currentTime3 = microtime(true);
echo "<br>String 2: " . ($currentTime3 - $currentTime2);

echo "<br><br>Faktor: " . (($currentTime3 - $currentTime2) / ( $currentTime2 - $currentTime));

Can someone confirm this? I ran into this because I was deleting some lines from a big file by reading it and appending only the wanted lines to a string again.

Using .= was solving all my problems here. Before I got a timeout!

mdempfle
  • Confirmed, I've noticed the same thing concatenating 40k rows of SQL into a string on an embedded device. The difference is huge! – Geoffrey VL Jul 15 '22 at 12:03
  • If you have Opcache enabled, as well as opcache.optimization_level set to some non-zero value (mine is 0x7FFEBFFF), then both tests are about the same speed. – Mike Richardson Sep 02 '23 at 09:32
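
For anyone who wants to re-check this on their own setup, here is a minimal sketch of a repeatable comparison. The helper name `timeIt` is made up, the iteration count is kept smaller than in the answer, and, as the comment above notes, the numbers can differ a lot depending on PHP version and Opcache settings.

```php
<?php
// Hypothetical helper: runs a callable and returns array(result, elapsed seconds).
function timeIt($fn)
{
    $start = microtime(true);
    $result = $fn();
    return array($result, microtime(true) - $start);
}

$n = 5000; // kept small for a quick run; the answer above used 50,000

// Variant 1: compound assignment with .=
list($a, $tAppend) = timeIt(function () use ($n) {
    $s = '';
    for ($i = 0; $i < $n; $i++) {
        $s .= 'x';
    }
    return $s;
});

// Variant 2: explicit reassignment with $s = $s . ...
list($b, $tReassign) = timeIt(function () use ($n) {
    $s = '';
    for ($i = 0; $i < $n; $i++) {
        $s = $s . 'x';
    }
    return $s;
});

// Both variants must build the same string; only the timings may differ.
printf("append: %.6f s, reassign: %.6f s, same result: %s\n",
    $tAppend, $tReassign, $a === $b ? 'yes' : 'no');
```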
2

Yes, they do. For example, if you want to echo a couple of strings together, use

echo $str1, $str2, $str3;

instead of

echo $str1 . $str2 . $str3;

to get it a little faster.
mixdev
1

Firstly, if you don't need the strings to be concatenated, don't do it: it will always be quicker to do

echo $a,$b,$c;

than

echo $a . $b . $c;

However, at least in PHP5, string concatenation is really quite fast, especially if there's only one reference to a given string. I guess the interpreter uses a StringBuilder-like technique internally.

Anthony Williams
0

If you're placing variable values within PHP strings, I understand that it's slightly quicker to use in-line variable inclusion (that's not its official name; I can't remember what it is):

$aString = 'oranges';
$compareString = "comparing apples to {$aString}!";
echo $compareString;
// prints: comparing apples to oranges!

This must be inside double quotes to work. It also works for array members, e.g.:

echo "You requested page id {$_POST['id']}";

cori
-4

There is no such limitation in PHP. PHP can concatenate strings with the dot (.) operator:

$a="hello ";
$b="world";
echo $a.$b;

outputs "hello world"

paan
  • 4
    people here is quick on the trigger.. i was typing in the dark.. accidentally hit tab then enter.. – paan Sep 23 '08 at 21:45