0

This question is similar to "Fastest way to determine if an integer is between two integers (inclusive) with known sets of values", but the accepted answer there will not work in PHP (as far as I know), because PHP is not strictly typed and does not offer controllable integer overflow.

The use case here is to determine whether an integer is between 65 and 90 (the ASCII values for 'A' and 'Z'). These bounds might help optimize the solution, since 64 is a power of two and acts as a boundary condition for this problem.

The only pseudo-optimization I have come up with so far is:

//$intVal will be between 0 and 255 (inclusive)
function isCapital($intVal)
{
    //255-64=191 (bit mask of 1011 1111)
    return (($intVal & 191) <= 26) && (($intVal & 191) > 0);
}

This function is not much of an improvement (possibly slower) over a normal double comparison of $intVal >= 65 && $intVal <= 90, but it is just where I started heading while trying to optimize.
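One thing worth checking before benchmarking is whether the masked form agrees with the double comparison over the whole 0 to 255 input range. The sketch below is only an illustration: the function names `isCapitalMasked` and `isCapitalPlain` are made up here, and the mask is computed once rather than twice, as one of the comments suggests.

```php
<?php
// The question's mask-based check, with the mask computed once (hypothetical name).
function isCapitalMasked(int $intVal): bool
{
    $masked = $intVal & 191; // clear bit 6 (value 64)
    return $masked > 0 && $masked <= 26;
}

// Reference check using two plain comparisons (hypothetical name).
function isCapitalPlain(int $intVal): bool
{
    return $intVal >= 65 && $intVal <= 90;
}

// Collect every input in 0..255 where the two checks disagree.
$mismatches = [];
for ($i = 0; $i <= 255; $i++) {
    if (isCapitalMasked($i) !== isCapitalPlain($i)) {
        $mismatches[] = $i;
    }
}

// Values 1..26 already have bit 6 clear, so the masked form accepts them too.
echo count($mismatches) . ' mismatches, from ' . min($mismatches) . ' to ' . max($mismatches) . PHP_EOL;
```

So the mask alone does not distinguish 'A' to 'Z' from the control codes 1 to 26; it is only equivalent if the input is already known to be at least 64.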

Scott
  • mh, I think that if you get to the point where you need to do those optimization, then you picked the wrong language – Federkun Oct 19 '15 at 23:24
  • why not use `if (preg_match("/[A-Z]/",$char,$foo)) {}`? – Wobbles Oct 19 '15 at 23:29
  • As above, unless this is a purely academic exercise, then you probably want to write a php extention in c – Steve Oct 19 '15 at 23:29
  • @Wobbles, its interesting you bring up preg_match, that is the function that I have been benchmarking my solution against. The full use case has been trying to optimize a `camelCase` string into a snake_case string. So far, I am 20% faster than preg_replace. – Scott Oct 19 '15 at 23:32
  • @steve, this is mostly an academic exercise kind of a like a code golf challenge in php. – Scott Oct 19 '15 at 23:33
  • are you testing character by character, or against a whole string? If you test the whole string for character other than [A-Z] I bet youd be quicker then – Wobbles Oct 19 '15 at 23:34
  • If it really matters, you might get better optimisation if you didn't do `($intVal & 191)` twice – Mark Baker Oct 19 '15 at 23:34
  • Another option is switch case which I believe hashes the possible values and sometimes saves a couple ms, not sure if itll help in this case, but something to test none the less. – Wobbles Oct 19 '15 at 23:37

4 Answers

5
function isCapitalBitwise($intVal) {
    return (($intVal & 191) <= 26) && (($intVal & 191) > 0);
}
function isCapitalNormal($intVal) {
    return $intVal >= 65 && $intVal <= 90;
}
function doTest($repetitions) {
    // microtime(true) returns a float number of seconds; the default
    // "msec sec" string form cannot be subtracted reliably.
    $i = 0;
    $startFirst = microtime(true);
    while ($i++ < $repetitions) {
        isCapitalBitwise(76);
    }
    $first = microtime(true) - $startFirst;
    $i = 0;
    $startSecond = microtime(true);
    while ($i++ < $repetitions) {
        isCapitalNormal(76);
    }
    $second = microtime(true) - $startSecond;
    $i = 0;
    $startThird = microtime(true);
    while ($i++ < $repetitions) {
        ctype_upper('A');
    }
    // Elapsed time is end minus start.
    $third = microtime(true) - $startThird;
    echo $first . ' ' . $second . ' ' . $third . PHP_EOL;
}
doTest(1000000);

On my system this returns:

0.217393 0.188426 0.856837

PHP is not as good at bitwise operations as compiled languages... but more importantly, I had to do a million comparisons to get less than 3 hundredths of a second of difference.

Even ctype_upper() is well within the range of "you might save a few seconds of CPU time per year" compared with these other approaches, with the added bonus that you don't have to call ord() first.

Go for readability. Go for maintainability. Write your application, then profile it to see where your real bottlenecks are.
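For quick comparisons like the one above, a small timing helper keeps the measurement code in one place. This is a minimal sketch, not a profiler: the name `timeIt` is made up for illustration, and it assumes PHP 7.3 or later so that `hrtime()` is available.

```php
<?php
// Hypothetical helper: runs $fn $repetitions times and returns elapsed seconds.
function timeIt(callable $fn, int $repetitions): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds (PHP 7.3+)
    for ($i = 0; $i < $repetitions; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e9;
}

$elapsed = timeIt(function () {
    return ctype_upper('A');
}, 100000);

printf("ctype_upper: %.6f s\n", $elapsed);
```

The closure call adds its own overhead to every iteration, so the figures are only useful for comparing candidates measured the same way.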

Ghedipunk
4

Instead of reinventing the wheel, why not use PHP's built-in function ctype_upper?

$char = 'A';    
echo ctype_upper($char) ? "It's uppercase" : "It's lowercase";

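You can even pass in the integer value of a character (integers from -128 to 255 are interpreted as the ASCII value of a single character; note that passing a non-string argument is deprecated as of PHP 8.1):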

echo ctype_upper($intVal) ? "It's uppercase" : "It's lowercase";

http://php.net/manual/en/function.ctype-upper.php
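If the value arrives as an integer and the code should avoid relying on the integer form of the argument, one option is to convert it to a one-character string first. A minimal sketch, assuming the default "C" locale and a made-up wrapper name `isUpperByte`:

```php
<?php
// Hypothetical wrapper: converts the byte value to a one-character string first.
function isUpperByte(int $intVal): bool
{
    return ctype_upper(chr($intVal));
}

var_dump(isUpperByte(65)); // 'A'
var_dump(isUpperByte(97)); // 'a'
var_dump(isUpperByte(64)); // '@'
```

The extra chr() call costs a little time, so this trades a small amount of speed for not depending on how integers are interpreted.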

Even if you do find a method other than comparing via && or what I pasted above, the difference will be a matter of microseconds. You will waste hours coming up with a way to save a few seconds over the course of a year.

skrilled
  • I wish I could upvote this many more times. Instead: $Premature_Optimization === 'Premature Tar Pit of Death'; – MichaelClark Oct 19 '15 at 23:28
  • I was not aware of this function, this might be exactly what I was looking for! I just need to try it out first to confirm. – Scott Oct 19 '15 at 23:29
1

From "How to check if an integer is within a range?":

t1_test1: ($val >= $min && $val <= $max): 0.3823 ms

t2_test2: (in_array($val, range($min, $max))): 9.3301 ms

t3_test3: (max(min($val, $max), $min) == $val): 0.7272 ms

You can also use range() with characters ('A', 'B', 'C', ...), but as the timings show, it is by far the slowest approach.
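The three quoted expressions can be written out as functions to confirm they agree before comparing their speed. This is a sketch for illustration only; the function names are made up and are not taken from the linked answer.

```php
<?php
// Hypothetical helper names; each mirrors one of the quoted expressions.
function inRangeCompare(int $val, int $min, int $max): bool
{
    return $val >= $min && $val <= $max;
}

function inRangeArray(int $val, int $min, int $max): bool
{
    // Builds a full array on every call, which explains the slow timing.
    return in_array($val, range($min, $max), true);
}

function inRangeClamp(int $val, int $min, int $max): bool
{
    // Clamping leaves the value unchanged only when it is already in range.
    return max(min($val, $max), $min) === $val;
}

// Check the boundaries of 'A' (65) to 'Z' (90) with all three variants.
foreach ([64, 65, 76, 90, 91] as $code) {
    printf(
        "%d: %s %s %s\n",
        $code,
        var_export(inRangeCompare($code, 65, 90), true),
        var_export(inRangeArray($code, 65, 90), true),
        var_export(inRangeClamp($code, 65, 90), true)
    );
}
```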

Luis Ávila
1

I think you will get the best results by going native, but it's only a fraction faster. Use ctype_upper directly. Here are my tests.

<?php
$numTrials = 500000;
$test = array();
for ($ii = 0; $ii < $numTrials; $ii++) {
  $test[] = mt_rand(0, 255);
}

function compare2($intVal) {
   return $intVal >= 65 && $intVal <= 90;
}

$tic = microtime(true);
for ($ii = 0; $ii < $numTrials; $ii++) {
   $result = compare2($test[$ii]);
}
$toc = microtime(true);
echo "compare2...: " . ($toc - $tic) . "\n";

$tic = microtime(true);
for ($ii = 0; $ii < $numTrials; $ii++) {
   $result = ctype_upper($test[$ii]);
}
$toc = microtime(true);
echo "ctype_upper: " . ($toc - $tic) . "\n";

echo "\n";

Which gives something pretty consistently like:

compare2...: 0.39210104942322 
ctype_upper: 0.32374000549316
Victory