7

Before flagging this as a duplicate, please read below, and check my code * my updated code!

So my problem is that, I have to implement Java/JavaScript '>>>' (Unsigned Right Shift / Zero-fill Right Shift), but I can't get it work exactly the same way.

I've selected the 11 most promising implementations I've found on SO and on the web (links are added as comments in the code) and added a few test cases. Unfortunately NONE of the functions returned the same response as Java/JS to ALL of the tests. (Maybe some of them are only working on 32bit systems)

Live Code + JS+PHP results demo (click Run):
http://phpfiddle.org/main/code/bcv7-bs2q *
http://phpfiddle.org/main/code/dpkw-rxfe

The closest functions are:

// http://stackoverflow.com/a/27263298
function shr9($a,$b) { 
    if($a>=0) return $a>>$b;
    if($b==0) return (($a>>1)&0x7fffffff)*2+(($a>>$b)&1);
    return ((~$a)>>$b)^(0x7fffffff>>($b-1)); 
}

and

// http://stackoverflow.com/a/25467712
function shr11($a, $b) { 
    if ($b > 32 || $b < -32) {
        $m = (int)($b/32);
        $b = $b-($m*32);
    }

    if ($b < 0)
        $b = 32 + $b;

    if ($a < 0) 
    { 
        $a = ($a >> 1); 
        $a &= 2147483647; 
        $a |= 0x40000000; 
        $a = ($a >> ($b - 1)); 
    } else { 
        $a = ($a >> $b); 
    } 
    return $a; 
}

Unfortunately shr9 fails on (-10 >>> -3) and * (32 >> 32), but is the only to pass (-3 >>> 0); and shr11 fails on (-3 >>> 0) and also (32 >>> 32).

Test cases:

         0 >>> 3    == 0 
         3 >>> 0    == 3 
         0 >>> -3   == 0 
        -3 >>> 0    == 4294967293 (in JS); -3 (in Java)  
        10 >>> 3    == 1 
        10 >>> -3   == 0 
       -10 >>> 3    == 536870910 
       -10 >>> -3   == 7 
-672461345 >>> 25   == 107 
        32 >>> 32   == 32 
       128 >>> 128  == 128 

Edit: I found that -3 >>> 0 is equals 4294967293 only in JavaScript (why?), but in Java, it equals -3. Unfortunately, this doesn't change the fact that I still can't get any function to pass all tests.


*BIG UPDATE:

Since PHP 7, bit shift by a negative number is considered to be invalid and causes: "Fatal error: Uncaught ArithmeticError: Bit shift by negative number". According to this, I think we don't have to pass those tests, so I've updated the question and the codes.

frzsombor
  • 2,274
  • 1
  • 22
  • 40

3 Answers3

8

After looking into the two functions from the question ("shr9" and "shr11") and merging/tweaking the good parts, I finally found the solution. All tests passed (I even added more in the demo), and it also works for shifts by a negative number.

[Live Demo]

function unsignedRightShift($a, $b) {
    if ($b >= 32 || $b < -32) {
        $m = (int)($b/32);
        $b = $b-($m*32);
    }

    if ($b < 0) {
        $b = 32 + $b;
    }

    if ($b == 0) {
        return (($a>>1)&0x7fffffff)*2+(($a>>$b)&1);
    }

    if ($a < 0) 
    { 
        $a = ($a >> 1); 
        $a &= 0x7fffffff; 
        $a |= 0x40000000; 
        $a = ($a >> ($b - 1)); 
    } else { 
        $a = ($a >> $b); 
    } 
    return $a; 
}

This code is not just accurate, but fast too.
Benchmark results: 100000 loops in: 0.25 sec
Benchmark test: http://phpfiddle.org/main/code/mj68-1s7e

frzsombor
  • 2,274
  • 1
  • 22
  • 40
  • My inner geek crys for the ability to do this in PHP using vanilla code but performance reasons make it unrealistic. 100,000 loops in 1 second is way too slow for handling numbers. But +1 on your example. Too bad we can't just do a basic cast like in C. – JSON Nov 06 '22 at 19:02
3

As I was really out of ideas, I cloned the Chromium V8 engine and the Mozilla Central repo for getting SpiderMonkey. I started searching for the JS >>> operator, and finally in Mozilla's code, I found an almost 20 years old file (from 1997), js/src/tests/ecma/Expressions/11.7.3.js, which contained the operator-less code for testing "zero-filling bitwise right shift operation". After rewriting this in PHP, this code passed all the tests.

[LiveDemo]

<?php

function ToInteger( $n ) {
  $sign = ( $n < 0 ) ? -1 : 1;

  if ( $n != $n ) {
    return 0;
  }
  if ( abs( $n ) == 0 || abs( $n ) == INF ) {
    return $n;
  }
  return intval( $sign * floor(abs($n)) );
}

function ToInt32( $n ) {
  $sign = ( $n < 0 ) ? -1 : 1;

  if ( abs( $n ) == 0 || abs( $n ) == INF) {
    return 0;
  }

  $n = ($sign * floor( abs($n) )) % pow(2,32);
  $n = ( $n >= pow(2,31) ) ? $n - pow(2,32) : $n;

  return ( $n );
}

function ToUint32( $n ) {
  $sign = ( $n < 0 ) ? -1 : 1;

  if ( abs( $n ) == 0 || abs( $n ) == INF) {
    return 0;
  }

  $n = $sign * floor( abs($n) );
  $n = $n % pow(2,32);

  if ( $n < 0 ){
    $n += pow(2,32);
  }

  return ( $n );
}

function ToUint16( $n ) {
  $sign = ( $n < 0 ) ? -1 : 1;

  if ( abs( $n ) == 0 || abs( $n ) == INF) {
    return 0;
  }

  $n = ( $sign * floor( abs($n) ) ) % pow(2,16);

  if ($n <0) {
    $n += pow(2,16);
  }

  return ( $n );
}

function Mask( $b, $n ) {
  $b = ToUint32BitString( $b );
  $b = substr( $b, strlen($b) - $n );
  $b = ToUint32Decimal( $b );
  return ( $b );
}

function ToUint32BitString( $n ) {
  $b = "";
  for ( $p = 31; $p >=0; $p-- ) {
    if ( $n >= pow(2,$p) ) {
      $b .= "1";
      $n -= pow(2,$p);
    } else {
      $b .= "0";
    }
  }
  return $b;
}

function ToInt32BitString( $n ) {
  $b = "";
  $sign = ( $n < 0 ) ? -1 : 1;

  $b .= ( $sign == 1 ) ? "0" : "1";

  for ( $p = 30; $p >=0; $p-- ) {
    if ( ($sign == 1 ) ? $sign * $n >= pow(2, $p) : $sign * $n > pow(2,$p) ) {
      $b .= ( $sign == 1 ) ? "1" : "0";
      $n -= $sign * pow( 2, $p );
    } else {
      $b .= ( $sign == 1 ) ? "0" : "1";
    }
  }

  return $b;
}

function ToInt32Decimal( $bin ) {
  $r = 0;
  $sign;

  if ( intval($bin[0]) == 0 ) {
    $sign = 1;
    $r = 0;
  } else {
    $sign = -1;
    $r = -(pow(2,31));
  }

  for ( $j = 0; $j < 31; $j++ ) {
    $r += pow( 2, $j ) * intval($bin[31-$j]);
  }

  return $r;
}

function ToUint32Decimal( $bin ) {
  $r = 0;


  for ( $l = strlen($bin); $l < 32; $l++ ) {
    $bin = "0" . $bin;
  }

  for ( $j = 0; $j < 32; $j++ ) {
    $r += pow( 2, $j ) * intval($bin[31-$j]);

  }

  return $r;
}

function RShift( $s, $a ) {
  $s = ToUint32BitString( $s );
  for ( $z = 0; $z < $a; $z++ ) {
    $s = "0" . $s;
  }
  $s = substr( $s, 0, strlen($s) - $a );

  return ToUint32Decimal($s);
}

function UnsignedRightShift( $s, $a ) {
  $s = ToUint32( $s );
  $a = ToUint32( $a );
  $a = Mask( $a, 5 );
  return ( RShift( $s, $a ) );
}

Usage example: UnsignedRightShift(10, 3); ( = 10 >>> 3 )

Disclaimer: I know that this is not even close to a "professional" solution, the performance is bad (4.33s with 110,000 loops; functions in question finish ~0.04s with 110,000 loops), and maybe there are even unnecessary functions in this snippet, but currently I only had time to implement it line by line. If anyone has a better solution, better performance or cleaner code, I'm more than happy to see it.

frzsombor
  • 2,274
  • 1
  • 22
  • 40
0

What this comes down to is each language's concepts of numbers, how they store numbers internally, as well as the overall scope of how each language is able to handle numbers.

Generally speaking, PHP stores all numbers as 64 bit ints, declared internally as a long long. Like in C/C++, a PHP numbers actual value in memory is always the same regardless of signedness. Declaring a int as signed/unsigned only effects how the value is later interpreted in order to allow the concept of negative values. Because of this, in C, all you need to do in order to perform an “unsigned shift” is cast the value to unsigned

unsigned int unsigned_shift_result = (unsigned int)somevalue >> someothrervalue;

In PHP this is done automatically, so -1 >> 0 in PHP is the same as -1 >>> 0 in Java.

Java and Javascript are somewhat different because they both have features for handling the concept of “big numbers”, that is a number that is greater than the systems INT_MAX/INT64_MAX, such as 90000000000000000000000000000000000000000000000000000000000000000000. Because of this a value may be greater than the system can handle using buildin arithmetic and bitwise instructions, so these languages includes a few additional facilities for handling numbers such as an unsigned shift. This is because 90000000000000000000000000000000000000000000000000000000000000000000 and -90000000000000000000000000000000000000000000000000000000000000000000 (as well as how they are handled) have very significant fundamental differences when compared to 1000 and -1000

With that said, Java and Javascript still deal with numbers the same as PHP, C, etc when they are within the range of what the system can handle naively. I don't see why they muddied the water by throwing in an "unsigned shift". No matter how you break it down, your just dealing with a virtual instruction as well as other higher level concepts of how a value is being represented.

JSON
  • 1,819
  • 20
  • 27