8

How can I convert a string from UTF-8 to the GSM 03.38 encoding in PHP?

einstein

3 Answers

5

Take a look at this GSM encoder class from php-smpp and see if it helps:

https://github.com/onlinecity/php-smpp/blob/master/gsmencoder.class.php

mrAfzaL
2

I had the same question and found a solution, based on @gumbo's example and the class @mrAfzaL linked to:

// Requires the intl extension (for Normalizer).
function utf8_to_gsm0338($string)
{
    $dict = array(
        '@' => "\x00", '£' => "\x01", '$' => "\x02", '¥' => "\x03", 'è' => "\x04", 'é' => "\x05", 'ù' => "\x06", 'ì' => "\x07", 'ò' => "\x08", 'Ç' => "\x09", 'Ø' => "\x0B", 'ø' => "\x0C", 'Å' => "\x0E", 'å' => "\x0F",
        'Δ' => "\x10", '_' => "\x11", 'Φ' => "\x12", 'Γ' => "\x13", 'Λ' => "\x14", 'Ω' => "\x15", 'Π' => "\x16", 'Ψ' => "\x17", 'Σ' => "\x18", 'Θ' => "\x19", 'Ξ' => "\x1A", 'Æ' => "\x1C", 'æ' => "\x1D", 'ß' => "\x1E", 'É' => "\x1F",
        '¤' => "\x24", // the other \x2? characters are the same as in ASCII
        // all \x3? characters are the same as in ASCII
        '¡' => "\x40", // the other \x4? characters are the same as in ASCII
        'Ä' => "\x5B", 'Ö' => "\x5C", 'Ñ' => "\x5D", 'Ü' => "\x5E", '§' => "\x5F",
        '¿' => "\x60", '`' => '?', // GSM 03.38 has no backtick; 0x60 is ¿
        'ä' => "\x7B", 'ö' => "\x7C", 'ñ' => "\x7D", 'ü' => "\x7E", 'à' => "\x7F",
        '^' => "\x1B\x14", '{' => "\x1B\x28", '}' => "\x1B\x29", '\\' => "\x1B\x2F", '[' => "\x1B\x3C", '~' => "\x1B\x3D", ']' => "\x1B\x3E", '|' => "\x1B\x40", '€' => "\x1B\x65"
    );

    // Compose first, then strip accents only from characters that have no GSM 03.38 equivalent.
    // Decomposing the whole string up front would turn 'é' into 'e' before the dictionary lookup.
    $string = preg_replace_callback('/[^\x00-\x7F]/u', function ($m) use ($dict) {
        return isset($dict[$m[0]]) ? $m[0] : preg_replace('/\p{Mn}/u', '', Normalizer::normalize($m[0], Normalizer::FORM_KD));
    }, Normalizer::normalize($string, Normalizer::FORM_C));
    $converted = strtr($string, $dict);

    // Replace any remaining multi-byte UTF-8 sequence (U+0080-U+07FF, U+0800-U+FFFF, U+010000-U+10FFFF) with a single ?
    return preg_replace('/([\\xC0-\\xDF].)|([\\xE0-\\xEF]..)|([\\xF0-\\xFF]...)/m','?',$converted);
}
Alisson
0

I open-sourced a library, GsmCharsetConverter, that does just that:

use BenMorel\GsmCharsetConverter\Converter;

$converter = new Converter();
$gsmString = $converter->convertUtf8ToGsm(
    'Helló', // input string
    true,    // whether to use transliteration when a close match is available
    '?'      // replace unconvertible chars with this character
);

The output is an unpacked GSM 03.38 string: each 7-bit character occupies a full byte whose most significant bit is 0. You can then pack it into a binary string, in which the 7-bit characters are stored back to back:

use BenMorel\GsmCharsetConverter\Packer;

$packer = new Packer();
$packedGsmString = $packer->pack($gsmString);
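
To illustrate what packing means, here is a minimal sketch of 7-bit packing in plain PHP, assuming the usual SMS layout in which septets are stored back to back with the least significant bits first. The function name gsm_pack_septets is made up for this example; it is not part of the library, and padding details may differ from what Packer produces.

```php
<?php
// Hypothetical helper, for illustration only (not part of GsmCharsetConverter).
// Packs an unpacked GSM 03.38 string (one septet per byte) into octets,
// filling each octet from its least significant bit upwards.
function gsm_pack_septets(string $unpacked): string
{
    $packed = '';
    $buffer = 0; // bit accumulator
    $bits = 0;   // number of valid bits currently held in $buffer

    $length = strlen($unpacked);
    for ($i = 0; $i < $length; $i++) {
        // Keep only the low 7 bits and append them above the bits already buffered.
        $buffer |= (ord($unpacked[$i]) & 0x7F) << $bits;
        $bits += 7;

        // Emit every complete octet.
        while ($bits >= 8) {
            $packed .= chr($buffer & 0xFF);
            $buffer >>= 8;
            $bits -= 8;
        }
    }

    // Flush the remaining bits; the unused high bits of the last octet stay 0.
    if ($bits > 0) {
        $packed .= chr($buffer & 0xFF);
    }

    return $packed;
}

// 'hello' has the same byte values in ASCII and GSM 03.38.
echo bin2hex(gsm_pack_septets('hello')), "\n"; // prints e8329bfd06
```

Five septets (35 bits) fit into five octets here; in general, every 8 characters of unpacked text take 7 bytes once packed.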
BenMorel