77

Example: I have a $variable = "_foo", and I want to make absolutely sure that $variable does not start with an underscore "_". How can I do that in PHP? Is there some access to the char array behind the string?

Ram Sharma
openfrog
  • 1
    Someone had posted an answer that used a regular expression which surprisingly got downvoted 4 times on the grounds that it was "not an appropriate use of regular expressions". The owner of that answer deleted it due to peer pressure. If input validation is not a good use of regular expressions, I don't know what is. Performance is **not** a significant factor in this use case. If the poster would like to undelete the regular expression answer, I will happily upvote it. – Asaph Dec 26 '09 at 19:41
  • 2
    @Asaph: Regular expressions are over-used, and over-suggested. They are _completely_ over-kill for this use case. I don't see how you can claim that performance is not a significant factor; you certainly don't back it up. – Lightness Races in Orbit Mar 27 '11 at 01:19
  • 1
    I'm afraid I agree with @LightnessRacesinOrbit here, a regex is an overkill, contrived answer to someone who doesn't know PHP basics. I think the regex answerer just went, "Oops!" :) – Henrik Erlandsson May 28 '14 at 21:16
  • @HenrikErlandsson Both regex and non-regex solutions are valid; any performance concerns should be added as a caveat in the body or pointed out in the comments. The OP isn't the only one who benefits from their question; the reader is left to decide which solution to go with. – rath May 29 '14 at 20:51
  • @rath Any overcomplicated answer that doesn't have side effects or straight up bugs is valid, doesn't mean it's a good answer. Especially since the OP also asked about the char array. – Henrik Erlandsson May 30 '14 at 07:15
  • @HenrikErlandsson You make a valid point. Cheers – rath May 30 '14 at 12:31

7 Answers

140
$variable[0] != "_"

How does it work?

In PHP you can get a particular character of a string with array index notation. $variable[0] is the first character of the string (provided $variable is a non-empty string).
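A minimal sketch of the index notation described above. The helper name `first_char_is` is hypothetical, and the `is_string`/`isset` guard is an addition for empty or non-string input; it is not part of the original one-liner.

```php
<?php
// Hypothetical helper: true when $s is a string whose first byte equals $char.
function first_char_is($s, $char)
{
    // isset($s[0]) is false for an empty string, so no offset is read in that case.
    return is_string($s) && isset($s[0]) && $s[0] === $char;
}

$variable = "_foo";
var_dump($variable[0]);                   // the first character of "_foo"
var_dump(first_char_is($variable, "_"));  // guarded check on the same string
var_dump(first_char_is("", "_"));         // empty string: the guard short-circuits
```

The guard costs a couple of extra checks, which seems a fair trade when the input may be empty.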

Imran
Peter Porfy
132

You might check out the substr function in PHP and grab the first character that way:

http://php.net/manual/en/function.substr.php

if (substr('_abcdef', 0, 1) === '_') { ... }
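The same idea can be extended to prefixes longer than one character. This is only a sketch: `starts_with_substr` is a hypothetical name, and it assumes a byte-wise comparison is acceptable. PHP 8.0 added a built-in `str_starts_with`, so the sketch defers to it when it is available.

```php
<?php
// Hypothetical helper generalising the substr() check to prefixes of any length.
function starts_with_substr($str, $prefix)
{
    // Prefer the built-in where it exists (PHP 8.0 and later).
    if (function_exists('str_starts_with')) {
        return str_starts_with($str, $prefix);
    }
    // Otherwise compare the first strlen($prefix) bytes against the prefix.
    return substr($str, 0, strlen($prefix)) === $prefix;
}

var_dump(starts_with_substr('_abcdef', '_'));
var_dump(starts_with_substr('abcdef', '_'));
```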
Alex Sexton
  • 5
    PHP 8.0 implements new method `str_starts_with` so it would be now `str_starts_with('_abcdef', '_')`: https://stackoverflow.com/questions/834303/startswith-and-endswith-functions-in-php/64160081#64160081 – Jsowa Oct 01 '20 at 17:12
55

Since someone mentioned efficiency, I've benchmarked the functions given so far out of curiosity:

function startsWith1($str, $char) {
    return strpos($str, $char) === 0;
}
function startsWith2($str, $char) {
    return stripos($str, $char) === 0;
}
function startsWith3($str, $char) {
    return substr($str, 0, 1) === $char;
}
function startsWith4($str, $char){
    return $str[0] === $char;
}
function startsWith5($str, $char){
    return (bool) preg_match('/^' . $char . '/', $str);
}
function startsWith6($str, $char, $encoding = null) {
    if (is_null($encoding)) $encoding = mb_internal_encoding();
    return mb_substr($str, 0, mb_strlen($char, $encoding), $encoding) === $char;
}

Here are the results on my average dual-core machine with 100,000 runs each:

// Tested '_string'
startsWith1 took 0.385906934738
startsWith2 took 0.457293987274
startsWith3 took 0.412894964218
startsWith4 took 0.366240024567 <-- fastest
startsWith5 took 0.642996072769
startsWith6 took 1.39859509468

// Tested "string"
startsWith1 took 0.384965896606
startsWith2 took 0.445554971695
startsWith3 took 0.42377281189
startsWith4 took 0.373164176941 <-- fastest
startsWith5 took 0.630424022675
startsWith6 took 1.40699005127

// Tested 1000 char random string [a-z0-9]
startsWith1 took 0.430691003799
startsWith2 took 4.447286129
startsWith3 took 0.413349866867
startsWith4 took 0.368592977524 <-- fastest
startsWith5 took 0.627470016479
startsWith6 took 1.40957403183

// Tested 1000 char random string [a-z0-9] with '_' prefix
startsWith1 took 0.384054899216
startsWith2 took 4.41522812843
startsWith3 took 0.408898115158
startsWith4 took 0.363884925842 <-- fastest
startsWith5 took 0.638479948044
startsWith6 took 1.41304707527

As you can see, treating the haystack as an array to read the character at the first position is always the fastest solution. It also performs at the same speed regardless of string length. Using strpos is faster than substr for short strings but slower for long strings when the string does not start with the prefix, because strpos then scans the whole string. The difference is irrelevant though. stripos is incredibly slow with long strings. preg_match performs mostly the same regardless of string length, but is only mediocre in speed. The mb_substr solution performs worst, though it is probably more reliable for multibyte strings.

Given that these numbers are for 100,000 runs, it should be obvious that we are talking about microseconds per call (0.37 s / 100,000 ≈ 3.7 µs for the fastest variant). Picking one over the other for efficiency is a worthless micro-optimization, unless your app is doing startsWith checking for a living.
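The timing loop itself is not shown in this answer, so the following is only a guess at what such a harness might look like. `time_calls`, `bench_strpos` and `bench_index` are hypothetical names; the two candidates repeat startsWith1 and startsWith4 so the sketch runs on its own. Note that `call_user_func` adds its own overhead to every measurement.

```php
<?php
// Hypothetical harness: time $runs calls of $fn and return the elapsed seconds.
function time_calls($fn, $str, $char, $runs = 100000)
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        call_user_func($fn, $str, $char);
    }
    return microtime(true) - $start;
}

// Two of the candidates from the listing, repeated under new names.
function bench_strpos($str, $char) { return strpos($str, $char) === 0; }
function bench_index($str, $char)  { return $str[0] === $char; }

foreach (array('bench_strpos', 'bench_index') as $fn) {
    printf("%s took %.6f\n", $fn, time_calls($fn, '_string', '_'));
}
```

Absolute numbers will differ by machine and PHP version; only the relative ordering is worth comparing.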

Gordon
10

This is the simplest answer if you are not concerned about performance:

if (strpos($string, '_') === 0) {
    # code
}

If strpos returns 0 it means that what you were looking for begins at character 0, the start of the string.
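The strict comparison `===` matters here because strpos returns false, not 0, when the needle is absent. A short illustration, with variable names chosen only for this example:

```php
<?php
$string  = 'foo_bar';  // contains "_" but not at the start
$missing = 'foobar';   // contains no "_" at all

var_dump(strpos($string, '_'));   // int(3)
var_dump(strpos($missing, '_'));  // bool(false)

// Loose comparison treats false as equal to 0, so it wrongly reports a match.
$loose  = (strpos($missing, '_') == 0);
// Strict comparison also checks the type, so false is not mistaken for 0.
$strict = (strpos($missing, '_') === 0);

var_dump($loose, $strict);
```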

It is documented thoroughly here: http://uk3.php.net/manual/en/function.strpos.php

(PS $string[0] === '_' is the best answer)

Anon343224user
5
function starts_with($s, $prefix){
    // returns a bool
    return strpos($s, $prefix) === 0;
}

starts_with($variable, "_");
miku
  • +1 All other solutions barf on empty string input. – Asaph Dec 25 '09 at 21:56
  • 8
    Inefficient - scans the whole string if the prefix is not found immediately. – Seva Alekseyev Dec 25 '09 at 21:57
  • 1
    From the `substr` manual: "If string is less than or equal to start characters long, FALSE will be returned." So substr($foo,0,1) works perfectly with empty strings. – Wim Dec 25 '09 at 21:58
  • As Seva said, this is really too inefficient. If I had to use a function I would go with substr instead of strpos – AntonioCS Dec 26 '09 at 00:16
  • 1
    @Wim: Thanks for pointing that out. Yet another example of PHP featuring an unintuitive, yet convenient behavior. – Asaph Dec 26 '09 at 15:34
  • `if($s==="") return false; else use_other_solution();` That means other solutions can be made empty string-proof too. – luiscubal Jan 07 '10 at 19:27
3

Here’s a better starts-with function:

function mb_startsWith($str, $prefix, $encoding=null) {
    if (is_null($encoding)) $encoding = mb_internal_encoding();
    return mb_substr($str, 0, mb_strlen($prefix, $encoding), $encoding) === $prefix;
}
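To illustrate why byte-oriented index notation can fall short, here is a small sketch assuming UTF-8 strings. `starts_with_bytes` is a hypothetical helper, not the function above: it swaps in strncmp, a byte-wise comparison, which should be enough for exact prefix matches when both strings are UTF-8, and it does not need the mbstring extension.

```php
<?php
// "é" encoded as UTF-8 occupies two bytes: 0xC3 0xA9.
$e_acute = "\xC3\xA9";
$str = $e_acute . 'cole';  // "école"

// Index notation returns a single byte, not the whole character.
$byte_match = ($str[0] === $e_acute);

// Hypothetical byte-wise helper: compares the first strlen($prefix) bytes.
function starts_with_bytes($str, $prefix)
{
    return strncmp($str, $prefix, strlen($prefix)) === 0;
}

var_dump($byte_match);
var_dump(starts_with_bytes($str, $e_acute));
```

A multibyte-aware function is still the safer choice if the two strings may use different encodings.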
Gumbo
2

To build on pinusnegra's answer, and in response to Gumbo's comment on that answer:

function has_leading_underscore($string) {

    return $string[0] === '_';

}

Running on PHP 5.3.0, the following works and returns the expected value, even without checking if the string is at least 1 character in length:

echo var_export(has_leading_underscore('_somestring'), true).', ';
echo var_export(has_leading_underscore('somestring'), true).', ';
echo var_export(has_leading_underscore(''), true).', ';
echo var_export(has_leading_underscore(null), true).', ';
echo var_export(has_leading_underscore(false), true).', ';
echo var_export(has_leading_underscore(0), true).', ';
echo var_export(has_leading_underscore(array('_foo', 'bar')), true);

/*
 * output: true, false, false, false, false, false, false
 */

I don't know how other versions of PHP will react, but if they all work, then this method is probably more efficient than the substr route.
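As far as I know, newer PHP versions are stricter here: reading offset 0 of an empty string, or using index notation on null, false or an integer, can raise a notice or warning even though the comparison still yields false. A defensive variant might look like the sketch below; `has_leading_underscore_checked` is a hypothetical name, and the type check is an assumption about what callers want.

```php
<?php
// Hypothetical variant that checks the type and length before indexing.
function has_leading_underscore_checked($value)
{
    // Only a non-empty string can start with an underscore.
    if (!is_string($value) || $value === '') {
        return false;
    }
    return $value[0] === '_';
}

$inputs = array('_somestring', 'somestring', '', null, false, 0, array('_foo', 'bar'));
foreach ($inputs as $input) {
    // var_export prints booleans as the words true/false.
    echo var_export(has_leading_underscore_checked($input), true), "\n";
}
```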

Carson Myers
  • why are you using the strings 'yes'/'no'? The language has booleans. – Chad Apr 13 '11 at 01:06
  • @Chad I can't remember. If I had to guess it'd be because I was directly printing the result and I felt like reading yes/no instead of true/false for some reason. You're right that using booleans would be better, and additionally it would simplify the logic. I'll change it. – Carson Myers Apr 14 '11 at 02:08