
I'm trying to encode a string using the Crockford Base32 algorithm.

Unfortunately, my current code only accepts numeric values as input. I thought of converting the ASCII characters to decimal or octal, but then concatenating 010 and 100 yields 10100, which cannot be decoded unambiguously. Is there a way to do this that I'm not aware of?

Ilmari Karonen
Stefan
  • The docs are passing a string to the function? `Crockford::encode('519571');` – BenM Dec 28 '12 at 20:59
  • If you look at the source code these get converted to ints – Stefan Dec 28 '12 at 20:59
  • I made a (pretty crappy) Base32 library you can try: https://github.com/NTICompass/PHP-Base32 – gen_Eric Dec 28 '12 at 20:59
  • Actually, they don't. The library checks to see if they're numeric, it doesn't convert them to integers, only the result of maths on the passed argument. – BenM Dec 28 '12 at 21:01
  • @BenM: If it's not numeric, it won't encode it. Check [this line](https://github.com/dflydev/dflydev-base32-crockford/blob/master/src/Dflydev/Base32/Crockford/Crockford.php#L57) of code. – gen_Eric Dec 28 '12 at 21:02
  • Could someone explain to me why there are 2 close votes? – Stefan Dec 28 '12 at 21:06
  • I don't know if my library's Crockford algorithm is correct, I don't think it is. – gen_Eric Dec 28 '12 at 21:08
  • What are you trying to do with this that you can't do with base64 encoding? – Barmar Dec 28 '12 at 21:08
  • I need it to be case insensitive, base64 doesn't support this. – Stefan Dec 28 '12 at 21:09

1 Answer


I believe this should be a more efficient implementation of Crockford Base32 encoding:

function crockford_encode( $base10 ) {
    return strtr( base_convert( $base10, 10, 32 ),
                  "abcdefghijklmnopqrstuv",
                  "ABCDEFGHJKMNPQRSTVWXYZ" );
}

function crockford_decode( $base32 ) {
    $base32 = strtr( strtoupper( $base32 ), 
                     "ABCDEFGHJKMNPQRSTVWXYZILO",
                     "abcdefghijklmnopqrstuv110" );
    return base_convert( $base32, 32, 10 );
}

(demo on codepad.org)
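As a quick sanity check, here is a minimal round-trip sketch. It repeats the two functions above so that it runs on its own; the sample values (1234, "16j", "IO") are only illustrations and are not taken from the original demo.

```php
<?php
// Copies of the two helpers above, repeated so this sketch is self-contained.
function crockford_encode( $base10 ) {
    return strtr( base_convert( $base10, 10, 32 ),
                  "abcdefghijklmnopqrstuv",
                  "ABCDEFGHJKMNPQRSTVWXYZ" );
}

function crockford_decode( $base32 ) {
    $base32 = strtr( strtoupper( $base32 ),
                     "ABCDEFGHJKMNPQRSTVWXYZILO",
                     "abcdefghijklmnopqrstuv110" );
    return base_convert( $base32, 32, 10 );
}

// 1234 = 1*32^2 + 6*32 + 18, and digit 18 is "J" in Crockford's alphabet.
echo crockford_encode( '1234' ), "\n";  // 16J

// Decoding is case-insensitive.
echo crockford_decode( '16j' ), "\n";   // 1234

// The look-alike letters I and L decode as 1, and O decodes as 0.
echo crockford_decode( 'IO' ), "\n";    // 32
```

Both functions return strings, because base_convert() itself returns a string.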

Note that, due to known limitations (or, arguably, bugs) in PHP's base_convert() function, these functions only return correct results for values that PHP's internal numeric type (probably double) can represent exactly. We can hope that this will be fixed in some future PHP version, but in the meantime, you could always use this drop-in replacement for base_convert().
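One possible way around the precision limit, sketched here on the assumption that the BCMath extension is installed, is to do the base conversion on decimal strings by repeated division. The function names crockford_encode_big() and crockford_decode_big() are hypothetical and are not part of the original answer or of the replacement it mentions.

```php
<?php
// Hypothetical helpers (not from the original answer). They assume the
// BCMath extension and operate on non-negative decimal strings.

function crockford_encode_big( string $base10 ): string {
    $alphabet = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';
    if ( $base10 === '' || !ctype_digit( $base10 ) ) {
        throw new InvalidArgumentException( 'expected a non-negative decimal string' );
    }
    $out = '';
    do {
        // Peel off the lowest base-32 digit, then divide by 32 (truncating).
        $out    = $alphabet[ (int) bcmod( $base10, '32', 0 ) ] . $out;
        $base10 = bcdiv( $base10, '32', 0 );
    } while ( bccomp( $base10, '0', 0 ) > 0 );
    return $out;
}

function crockford_decode_big( string $base32 ): string {
    $alphabet = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';
    if ( $base32 === '' ) {
        throw new InvalidArgumentException( 'expected a non-empty string' );
    }
    // Case-insensitive; the look-alikes I and L read as 1, O reads as 0.
    $base32 = strtr( strtoupper( $base32 ), 'ILO', '110' );
    $out = '0';
    foreach ( str_split( $base32 ) as $char ) {
        $value = strpos( $alphabet, $char );
        if ( $value === false ) {
            throw new InvalidArgumentException( "invalid symbol: $char" );
        }
        // Horner's rule: out = out * 32 + digit value.
        $out = bcadd( bcmul( $out, '32', 0 ), (string) $value, 0 );
    }
    return $out;
}

// 2^64 does not fit in a 64-bit integer; it equals 16 * 32^12.
echo crockford_encode_big( '18446744073709551616' ), "\n";  // G000000000000
echo crockford_decode_big( 'g000000000000' ), "\n";         // 18446744073709551616
```

Input and output stay as strings throughout, so no value ever passes through a float.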


Edit: The easiest way to compute the optional check digit is probably this:

function crockford_check( $base10 ) {
    return substr( "0123456789ABCDEFGHJKMNPQRSTVWXYZ*~$=U", $base10 % 37, 1 );
}

or, for large numbers:

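function crockford_check( $base10 ) {
    // bcmod() takes and returns decimal strings, so pass $base10 as a string
    // and cast the remainder back to an integer offset
    return substr( "0123456789ABCDEFGHJKMNPQRSTVWXYZ*~$=U", (int) bcmod( (string) $base10, "37" ), 1 );
}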

We can then use it like this:

function crockford_encode_check( $base10 ) {
    return crockford_encode( $base10 ) . crockford_check( $base10 );
}

function crockford_decode_check( $base32 ) {
    $base10 = crockford_decode( substr( $base32, 0, -1 ) );
    if ( strtoupper( substr( $base32, -1 ) ) != crockford_check( $base10 ) ) {
        return null;  // wrong checksum
    }
    return $base10;
}

(demo on codepad.org)
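To make the check symbol concrete: it is the value modulo 37, looked up in the 32-symbol alphabet extended with five extra symbols (*, ~, $, =, U) for remainders 32 to 36. The sketch below repeats the small-number crockford_check() so that it runs on its own; the sample inputs are illustrative only.

```php
<?php
// Copy of the small-number check function above. Single quotes are used
// here only to rule out any "$" interpolation in the symbol table.
function crockford_check( $base10 ) {
    return substr( '0123456789ABCDEFGHJKMNPQRSTVWXYZ*~$=U', $base10 % 37, 1 );
}

echo crockford_check( 1234 ), "\n";  // D  (1234 % 37 = 13)
echo crockford_check( 69 ), "\n";    // *  (69 % 37 = 32, first extra symbol)
echo crockford_check( 36 ), "\n";    // U  (36 % 37 = 36, last extra symbol)
```

With the functions above, crockford_encode_check( 1234 ) would therefore produce "16J" followed by "D".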

Note: (July 18, 2014) The original version of the code above had a bug in the Crockford alphabet strings, such that they read ...WZYZ instead of ...WXYZ, causing some numbers to be encoded and decoded incorrectly. This bug has now been fixed, and the codepad.org versions now include a basic self-test routine to verify this. Thanks to James Firth for spotting the bug and fixing it.

Community
Ilmari Karonen