
I'm experiencing some issues with character encoding using Ratchet PHP and WebSockets.

The Code

I have a server set up using Ratchet PHP. I have removed any unrelated code from this:

<?php

require __DIR__.'/bootstrap/autoload.php';

// Ensure correct character encoding...
ini_set('default_charset', 'utf-8');
setlocale(LC_CTYPE, 'en_GB.UTF-8');
mb_internal_encoding("UTF-8");

use Ratchet\MessageComponentInterface;
use Ratchet\ConnectionInterface;
use Ratchet\Http\HttpServerInterface;
use Guzzle\Http\Message\RequestInterface;

class WebSocketDelegate implements MessageComponentInterface {
    protected $clients;

    public function __construct() {
        $this->clients = new \SplObjectStorage;
    }

    public function onOpen(ConnectionInterface $connection) {
        // 
    }

    public function onMessage(ConnectionInterface $from, $msg) {
        // 
    }

    public function onClose(ConnectionInterface $connection) {
        //
    }

    public function onError(ConnectionInterface $connection, \Exception $e) {
        //
    }
}

class RequestInterceptor implements HttpServerInterface {
    private $delegate;

    public function __construct(HttpServerInterface $httpServer) {
        $this->delegate = $httpServer;
    }

    public function onOpen(ConnectionInterface $connection, RequestInterface $request = null) {
        $this->delegate->onOpen($connection, $request);
    }

    public function onMessage(ConnectionInterface $connection, $message) {
        print PHP_EOL . 'Message received, dumping data...' . PHP_EOL;
        var_dump($message);
    }

    public function onClose(ConnectionInterface $connection) {
        //
    }

    public function onError(ConnectionInterface $connection, \Exception $e) {
        //
    }
}

$hostname = 'localhost';
$port = 8889;

$app = \Ratchet\Server\IoServer::factory(
    new \Ratchet\Http\HttpServer(
        new RequestInterceptor(
            new \Ratchet\WebSocket\WsServer(
                new WebSocketDelegate()
            )
        )
    ),
    $port,
    $hostname
);

$app->run();

This is located in the <head> of my HTML:

<meta charset="utf-8" />

I have also created a wrapper class around the WebSocket JavaScript class that, among other unrelated things, converts my request into JSON and sends it via the WebSocket.

The Problem

Here is a screenshot of the JS Console log, which shows the exact JSON string being sent over the connection. All is fine here:

[Screenshot: JS console log showing the JSON string being sent]

And here is a screenshot of the command line output from the PHP WebSocket server. As you can see, the message being var_dump'd is malformed. I have also tried printing it plainly with print, which makes no difference.

[Screenshot: command line output of the PHP WebSocket server showing the malformed message]

Interestingly, the malformed characters are different every time. Also, when the fourth message is dumped with var_dump, the prefix that var_dump prepends to denote the variable's type and size is malformed as well. I'm not sure whether this means anything or is just an artifact of the malformed characters.

I also tried sending just the string 'test' through the WebSocket to the server and printing it out. Again, the malformed characters are different each time the message is printed, yet the reported length of the string 'test' is always 10 characters.
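One possible reading of the constant 10-byte length, offered as an assumption rather than a confirmed diagnosis: under RFC 6455, a client-to-server frame carrying a 4-byte text payload consists of 2 header bytes, a 4-byte masking key chosen at random per frame, and 4 masked payload bytes. That adds up to 10 bytes whose content changes on every send. The sketch below builds such a frame and unmasks it again. Both function names are hypothetical helpers, not part of Ratchet, and they only handle payloads of up to 125 bytes.

```php
<?php
// Hypothetical helper (not part of Ratchet): decodes one short, masked
// client-to-server frame laid out as described in RFC 6455, section 5.2.
function decodeSmallMaskedFrame(string $frame): ?string
{
    if (strlen($frame) < 6) {
        return null; // too short to hold the 2 header bytes plus the 4-byte key
    }
    $masked = (ord($frame[1]) & 0x80) !== 0; // high bit of byte 2 is the mask flag
    $length = ord($frame[1]) & 0x7F;         // low 7 bits are the payload length
    if (!$masked || $length > 125 || strlen($frame) < 6 + $length) {
        return null; // this sketch only handles short masked frames
    }
    $key = substr($frame, 2, 4);
    $payload = substr($frame, 6, $length);
    $out = '';
    for ($i = 0; $i < $length; $i++) {
        $out .= $payload[$i] ^ $key[$i % 4]; // bytewise XOR with the rotating key
    }
    return $out;
}

// Hypothetical helper: builds a masked text frame for payloads up to 125 bytes.
function buildSmallMaskedFrame(string $text, string $key): string
{
    $masked = '';
    for ($i = 0; $i < strlen($text); $i++) {
        $masked .= $text[$i] ^ $key[$i % 4];
    }
    // 0x81 = FIN bit + text opcode; 0x80 | length = mask bit + payload length
    return chr(0x81) . chr(0x80 | strlen($text)) . $key . $masked;
}

// The masking key here is arbitrary; a browser picks a new random one per frame.
$frame = buildSmallMaskedFrame('test', "\x12\x34\x56\x78");
var_dump(strlen($frame));                 // int(10)
var_dump(decodeSmallMaskedFrame($frame)); // string(4) "test"
```

If this reading applies, the data arriving in RequestInterceptor::onMessage would be raw frames that have not yet passed through WsServer, which would be consistent with the handshake headers printing correctly.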

Other Details

When I pass the malformed string to mb_detect_encoding, it is unable to determine the encoding.

I know that this must be related to Ratchet PHP because previously I was using a different PHP WebSocket library called 'Hoa'. This did not exhibit any of the above character encoding issues, but did have another problem which made me move to RatchetPHP.

Possibly related, though I'm not sure: when I installed RatchetPHP with Composer, it warned me that RatchetPHP uses a version of Guzzle that is 'abandoned'. I can't imagine that is the cause, though, because RatchetPHP has had releases since Guzzle was abandoned, and I would expect the developer to have switched to the new Guzzle if it were a problem.

Also note that when I printed the headers that are sent on the initial handshake, everything was fine and there were no malformed characters. Only the messages sent thereafter are malformed.

Environment

  • Windows 10 x64
  • Mostly stock XAMPP installation with some vhosts added in, memory_limit increased, default MySQL collation changed etc.
  • PHP 7.1.1

Things I've Tried

  • As seen in the PHP above, I've set the encoding in several different ways to ensure PHP sets the correct encoding.
  • Tried different browsers, same result
  • Everything suggested here
  • Capturing output and writing to a file to ensure it's not related to command prompt. Same result when written to file and opened in a text editor.
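Because the characters differ on every run, looking at the message as hex bytes rather than as text takes the console and the text editor out of the picture entirely. A minimal sketch, where hexDump is a hypothetical helper name and not something Ratchet provides:

```php
<?php
// Hypothetical debugging helper: renders raw bytes as space-separated hex pairs
// so the output does not depend on any console or editor encoding.
function hexDump(string $bytes): string
{
    return implode(' ', str_split(bin2hex($bytes), 2));
}

echo hexDump('test'), PHP_EOL; // 74 65 73 74
```

Calling hexDump($message) in RequestInterceptor::onMessage in place of var_dump would show whether the received bytes are plain UTF-8 text or something else.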

I'd try more things if I knew what else to try. I'm at a bit of a loss.

What on earth could be causing this?

thephpdev
  • I'd recommend checking out this SO post on windows encoding, specifically over stdout (your console output) http://stackoverflow.com/questions/1259084/what-encoding-code-page-is-cmd-exe-using. Also, make sure your default character encoding in php.ini is utf-8 http://stackoverflow.com/questions/9351694/setting-php-default-encoding-to-utf-8. – Robert Apr 12 '17 at 13:40
  • @Robert, thank you for your reply! Though, as mentioned in my question, under "Things I've Tried" I had thought of the possibility that cmd.exe may be at fault, and it is not. On the note of the default encoding, the default character encoding is indeed set to UTF-8, and all other entries related to character encoding are empty, and PHP falls back to the default encoding in those cases, which is UTF-8. – thephpdev Apr 12 '17 at 13:54
