143

In PHP, what does it mean for a function to be binary-safe?

What makes such functions special, and where are they typically used?

Zacky112

3 Answers

127

It means the function will work correctly when you pass it arbitrary binary data (i.e. strings containing non-ASCII bytes and/or null bytes).

For example, a non-binary-safe function might be based on a C function which expects null-terminated strings, so if the string contains a null character, the function would ignore anything after it.

This is relevant because PHP does not cleanly separate string and binary data.
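As a small illustration of that last point, here is a sketch using the built-in `pack`, `unpack`, `strlen` and `bin2hex` functions. The value 258 is just an arbitrary example; the packed result is an ordinary PHP string that happens to contain null bytes, and the binary-safe built-ins handle it without truncation.

```php
<?php
// Pack the integer 258 as a 32-bit big-endian value.
// The result is a regular PHP string holding the bytes 00 00 01 02.
$packed = pack('N', 258);

echo strlen($packed), "\n";   // 4: the leading null bytes are counted
echo bin2hex($packed), "\n";  // 00000102

// Round-trip the bytes back into an integer.
$decoded = unpack('Nvalue', $packed);
echo $decoded['value'], "\n"; // 258
```

A function that stopped at the first null byte would see this string as empty, which is why binary safety matters whenever strings carry packed or raw data.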

Michael Borgwardt
  • 2
    Does that mean that binary safe strings only contain "characters" of length 1 byte? – Charlie Parker Jul 09 '14 at 03:25
  • 3
    @CharlieParker: No, you got that backwards. Binary safety is a property of *functions* which means they process *any* string correctly. The converse would be a string that contains only ASCII characters *and* no null characters - such a string should be processed correctly by any function. – Michael Borgwardt Jul 09 '14 at 06:36
  • Maybe I got confused because I was reading the redis protocol for "bulk strings" and it said that they represent a "single binary safe" string. I think I understand your post correctly now. However, does it make sense to say that a string is "binary safe" (as in the example I provided)? – Charlie Parker Jul 09 '14 at 16:05
100

The other users already mentioned what binary safe means in general.

In PHP, the meaning is more specific, referring only to what Michael gives as an example.

Every string in PHP has an associated length, which is the number of bytes that compose it. When a function manipulates a string, it can either:

  1. Rely on that length meta-data.
  2. Rely on the string being null-terminated, i.e., that after the data that is actually part of the string, a byte with value 0 will appear.

It's also true that all PHP string variables manipulated by the engine are null-terminated. The problem with functions that rely on option 2 is that, if the string itself contains a byte with value 0, the function manipulating it will think the string has ended at that point and will ignore everything after it.

For instance, if PHP's strlen function worked like the C standard library's strlen, the result here would be wrong:

$str = "abc\x00abc";
echo strlen($str); // gives 7; a C-style strlen would give 3
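To make the contrast concrete, here is a minimal sketch of what a null-terminated length count would look like if written in PHP. The name `c_style_strlen` is hypothetical, used for illustration only; it is not a PHP built-in, and the real C function reads memory directly rather than looping over a PHP string.

```php
<?php
// Hypothetical helper: counts bytes up to the first NUL byte,
// mimicking how C's strlen treats a null-terminated buffer.
function c_style_strlen(string $s): int
{
    $n = 0;
    $len = strlen($s); // real length, used only as a bounds check here
    while ($n < $len && $s[$n] !== "\0") {
        $n++;
    }
    return $n;
}

$str = "abc\x00abc";
echo c_style_strlen($str), "\n"; // 3: stops at the embedded NUL
echo strlen($str), "\n";         // 7: uses the stored length
```

PHP's own strlen relies on the stored length (option 1), so the embedded null byte is counted like any other byte.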
Gerald
Artefacto
  • 5
    In my test in PHP 7.0, strlen() function is a binary safe function. – Lee Li Oct 28 '16 at 02:24
  • @Artefacto : Are you saying that the built-in PHP function `strlen()` is a **binary-safe** function? I'm confirming from you because on the **PHP Manual** page for the function `strlen()` it's not been mentioned that whether it's a **binary-safe** function or a **non-binary safe** function. This only missing thing from the **PHP Manual** is creating the confusion in my mind so I want to confirm it from you. I'm keenly looking forward to your reply. Thank You. – PHPLover Feb 10 '19 at 19:01
  • @PHPLover yes strlen() is binary safe. run `php -r 'var_dump("\x00\x00\x00");'` to verify, but php's strlen has been binary safe for a **very** long time, since at least php 4.x (that said, there is an abomination called "mb_overload", but lets just pretend that doesn't exist - https://www.php.net/manual/en/mbstring.overload.php ) – hanshenrik Apr 13 '20 at 01:21
64

More examples:

<?php

    $string1 = "Hello";
    $string2 = "Hello\x00World";

    // strcoll() is NOT binary-safe
    echo strcoll($string1, $string2); // gives 0: the strings are treated as equal

    // strcmp() is binary-safe
    echo strcmp($string1, $string2); // gives a value < 0: $string1 is less than $string2

?>

\x indicates hexadecimal notation. See: PHP strings

0x00 = NUL (null character)
0x04 = EOT (End of transmission)

See an ASCII table for the full list of ASCII characters.
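As a rough sketch of why the two comparisons above can disagree, the following emulates a comparison that only looks at the bytes before the first null byte. The helpers `cut_at_nul` and `c_style_compare` are hypothetical names for illustration; this is not how strcoll is actually implemented, only a model of the null-terminated behaviour.

```php
<?php
// Hypothetical helper: returns the part of the string a
// null-terminated C function would see (everything before the first NUL).
function cut_at_nul(string $s): string
{
    $pos = strpos($s, "\0");
    return $pos === false ? $s : substr($s, 0, $pos);
}

// Hypothetical non-binary-safe comparison: ignores bytes after a NUL.
function c_style_compare(string $a, string $b): int
{
    return strcmp(cut_at_nul($a), cut_at_nul($b));
}

$string1 = "Hello";
$string2 = "Hello\x00World";

var_dump(c_style_compare($string1, $string2) === 0); // bool(true): looks equal
var_dump(strcmp($string1, $string2) < 0);            // bool(true): really different
```

The exact negative value returned by strcmp differs between PHP versions, so the sketch only checks its sign.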

Anon30
Subscriberius
  • Just to make sure I understood, then `Hello\r\nWORLD` should not be the same as `Hello` if the function is binary safe, right? – Charlie Parker Jul 09 '14 at 02:28
  • Also, how is such a function implemented? Is there a regular expression that checks that its binary safe or does it use a different method? – Charlie Parker Jul 09 '14 at 03:06
  • @Subscriberius : Is the built in function `strlen()` **binary-safe**? – PHPLover Feb 10 '19 at 19:03
  • @PHPNut as the documentation points out `strlen() returns the number of bytes rather than the number of characters in a string.` So if you want to get the number of characters for any given string, use `mb_strlen` instead. – darighteous1 Jul 19 '22 at 12:03