10

As many of you already know, PHP 5.4 alpha has been released. I have a question regarding the following.

Simplified string offset reading. $str[1][0] is now a legal construct.

How exactly does $str[1][0] work?

EDIT: http://php.net/releases/NEWS_5_4_0_alpha1.txt

hakre
webjawns.com
  • Any link to the source you quote? – Felix Kling Jul 05 '11 at 19:51
  • example: http://codepad.viper-7.com/a1VeFo – Gordon Jul 05 '11 at 19:54
  • @Gordon: Ah...makes sense, I guess because of this: *Added array dereferencing support*. But I'm wondering what's the advantage of `$str[1][0]`. It is the same as `$str[1]`. So there is nothing special about it. `$str[1]` returns a string with one character and `[0]` is accessing the character at position `0`. – Felix Kling Jul 05 '11 at 19:56
  • @Felix that's what I'm wondering, too. I cannot make any sense of it. The commit message isn't any clearer either: http://marc.info/?l=php-cvs&m=127928075722194 - what's the practical purpose of that? – Gordon Jul 05 '11 at 19:59
  • @Gordon: Well, one possible explanation is that it's (as the commit message says) mainly a simplification (and maybe an optimization?) and the possibility of `$str[1][0]` is just a nice side effect. Furthermore, it could be for consistency reasons. (You can write `$strOffset = $str[0]; $strOffset[0]` in PHP 5.3, but not `$str[0][0]`, which may seem strange.) – NikiC Jul 05 '11 at 20:07
  • @Gordon... I think the practicality is just a simplified implementation. As a side effect, you can now add `[0]` as many times as you would like. And that actually makes sense (even if it has no practical purpose) because the offset should return a one character string. (PHP has no `char` type.) Edit: What @nikic said. – Matthew Jul 05 '11 at 20:09
  • The tests in the commit message look interesting though. `$str = "abc"; var_dump($str[-1]); var_dump($str[0]); var_dump($str[1]); var_dump($str[2]); var_dump($str[3]); var_dump($str[1][0]); var_dump($str[2][1]);` They just don't work on the viper pad. Maybe it's not 5.4 alpha yet. Maybe Harold isn't so wrong at all. – Gordon Jul 05 '11 at 20:10
  • @Gordon, yes but those negative primary and non-zero secondary indices result in "Uninitialized string offset", as one might expect. – Matthew Jul 05 '11 at 20:14
  • @Gordon: It throws notices on snap too ;) Just look at the EXPECTF section below the tests ;) – NikiC Jul 05 '11 at 20:17
  • @Gordon @Felix: Additionally it seems to solve problems when you do strange things to string (like doing `list($a, $b) = $str[0]`) – NikiC Jul 05 '11 at 20:26
  • @nikic true. tested it now. odd patch though :) – Gordon Jul 05 '11 at 20:27

3 Answers

11

It just means that reading a string offset returns a string again, on which you can read another offset (and on that result yet another one; it gets funny with $str[0][0][0][0][0][0]).

Before PHP 5.4 you would get a "Cannot use string offset as an array" fatal error.
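
A minimal sketch of that behaviour, assuming PHP 5.4 or later; the string "abc" and the variable names are illustrative only:

```php
<?php
// Illustrative value; any non-empty string behaves the same way.
$str = "abc";

// Reading an offset yields a one-character string ("b" here),
// so a further offset can be read from that result.
var_dump($str[1][0]);        // string(1) "b"

// Each extra [0] reads offset 0 of a one-character string again.
var_dump($str[1][0][0][0]);  // string(1) "b"

// Equivalent two-step form, which also works on PHP 5.3.
$tmp = $str[1];
var_dump($tmp[0]);           // string(1) "b"
```

Only offset 0 is valid on the intermediate one-character string; something like `$str[2][1]` reads past its end and raises an uninitialized string offset diagnostic.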

NikiC
  • True. But if you do a `var_dump($str[1])` in PHP 5.3 it also says that the value is a string. So it seems to be something internal.... weird. At least this does not seem to be related to array dereferencing as I first thought, because otherwise you would get a syntax error with `$str[0][0]` in PHP 5.3. – Felix Kling Jul 05 '11 at 20:05
  • @Felix: Array dereferencing is something different. It's the ability to get an array offset of a function return value ;) I.e. `func()[0]`. – NikiC Jul 05 '11 at 20:08
  • I know now... but I thought it might have been related (but now that I think about it, `$str[0][0]` would be more like accessing a multidimensional array, which already worked before.... I guess I should take a break ;)) – Felix Kling Jul 05 '11 at 20:15
  • This is also now valid: `$str = "ABC"; $str[0] = "XYZ";` The result: `$str == "XBC"`. – Matthew Jul 05 '11 at 20:25
  • @nikic, even if that didn't trigger an error, it probably would not actually change the original `$str`. That's more like saying "change the first index of the new string returned by `$str[0]`." That is, you'd technically be changing the value of a temporary string. – Matthew Jul 05 '11 at 20:38
7

This is a side effect, and was mentioned in the proposal here: http://php.markmail.org/thread/yiujwve6zdw37tpv

The actual change is a speed optimization of string offset reads. From the proposal:

Hi,

Recently I noticed that reading of string offset is performed in two steps. At first special string_offset variant of temporary_variable is created in zend_fetch_dimension_address_read() and then the real string value is created in _get_zval_ptr_var_string_offset().

I think we can create the real string in the first place. This gives a 50% speed-up on the string offset reading operation and allows us to eliminate some checks and conditional branches in the VM.

The patch is attached (don't forget to regenerate zend_vm_execute.h to test it). However it changes behavior in one bogus case. The following code now will emit "b" (currently it generates a fatal error - cannot use string offset as an array).

$str = "abs";
var_dump($str[1][0]);

I think it's not a problem at all. "b" makes sense because "abs"[1] -> "b" and "b"[0] -> "b".

I'm going to commit the patch in case of no objections.

Thanks. Dmitry.

Robert Harvey
Philip Olson
1

This can actually create some interesting bugs when you upgrade code from PHP 5.3 to 5.4.

In 5.3 the isset() call in this construct evaluates to false:

$array = array("This is a string");

echo isset($array[0][0][0]);

In 5.4 it evaluates to true.
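
One possible way to guard against this when upgrading is to check the type before relying on isset(). This is only a sketch: the helper name `hasNestedElement` and the sample values are hypothetical, not part of the original code.

```php
<?php
// Hypothetical helper: true only when $value is an array whose first
// element is itself an array that has an element at index 0.
function hasNestedElement($value)
{
    return is_array($value)
        && isset($value[0])
        && is_array($value[0])
        && isset($value[0][0]);
}

// Illustrative inputs.
$text   = "This is a string";
$nested = array(array("nested"));

// On PHP 5.4+ a plain isset() is true for the string as well,
// because $text[0] is a one-character string with an offset 0.
var_dump(isset($text[0][0]));        // bool(true)

// The type-checked helper tells the two cases apart.
var_dump(hasNestedElement($text));   // bool(false)
var_dump(hasNestedElement($nested)); // bool(true)
```

The helper gives the same answer on 5.3 and 5.4, since the string case is rejected by `is_array()` before any string offset is read.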

kufi
  • Interesting. Note that it's the `isset()` that allows this to run on earlier PHP versions, not the array ([demo with `var_dump( isset($foo[0][0]) );`](http://3v4l.org/I6b6J)). [Adding one more `[0]`](http://3v4l.org/IAfMY) is fatal on PHP 5.0 to 5.3. – IMSoP Sep 28 '13 at 02:40