
When I submit a non-English post with AJAX, I get something like %u4F60%u662F%u5982%u4F55%u505A%uFF1F. What can I do to fix this in PHP? I've tried utf8_decode, but it doesn't work.

I'm submitting the text with AJAX if that helps.

Kara
Jake

1 Answer


Does this do what you want?

<?php

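  // Convert non-standard %uXXXX escapes (exactly 4 hex digits each) to UTF-8,
  // then decode the ordinary %XX escapes. Matching {4,6} digits would swallow
  // literal hex-looking text after an escape, e.g. the "ab" in "%u4F60ab".
  function utf8_urldecode ($str) {
    return urldecode(preg_replace_callback('/%u([0-9A-F]{4})/i', function($matches) {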
      $int = hexdec($matches[1]);
      switch (TRUE) {
        case $int < 0x80:
          return pack('C*', $int & 0x7F);
        case $int < 0x0800:
          return pack('C*', (($int & 0x07C0) >> 6) | 0xC0, ($int & 0x3F) | 0x80);
        case $int < 0x010000:
          return pack('C*', (($int & 0xF000) >> 12) | 0xE0, (($int & 0x0FC0) >> 6) | 0x80, ($int & 0x3F) | 0x80);
        case $int < 0x110000:
          return pack('C*', (($int & 0x1C0000) >> 18) | 0xF0, (($int & 0x03F000) >> 12) | 0x80, (($int & 0x0FC0) >> 6) | 0x80, ($int & 0x3F) | 0x80);
        default:
          return $matches[0];
      }
    }, $str));
  }

  $str = "%u4F60%u662F%u5982%u4F55%u505A%uFF1F";

  echo utf8_urldecode($str);

I had never tried to convert hex-encoded Unicode code points to UTF-8 bytes before; it turns out to be quite easy once you get your head around it. Of course, the result may still display as nonsense in your browser, depending on what the characters actually are: you may need to install a language pack for them to render correctly.
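One case the manual bit-shifting does not cover is characters outside the Basic Multilingual Plane (emoji, for example), which typically arrive as two consecutive %uXXXX escapes forming a UTF-16 surrogate pair. A minimal sketch of an alternative follows, assuming the mbstring extension is available and that the input really consists of UTF-16 code units written as %uXXXX. The function name decode_percent_u is hypothetical, and it uses rawurldecode rather than urldecode so that a literal "+" is left alone, which may or may not suit your data.

```php
<?php

// Hypothetical helper, not part of the original answer.
// Assumes ext-mbstring is loaded and that each %uXXXX is one UTF-16 code unit.
function decode_percent_u(string $str): string
{
    // Match a whole run of consecutive %uXXXX escapes so that the two halves
    // of a surrogate pair are converted together.
    $decoded = preg_replace_callback(
        '/(?:%u[0-9A-F]{4})+/i',
        function (array $m): string {
            // Drop the "%u" prefixes, leaving a plain hex string.
            $hex = str_ireplace('%u', '', $m[0]);
            // hex2bin() yields big-endian UTF-16 bytes; convert them to UTF-8.
            return mb_convert_encoding(hex2bin($hex), 'UTF-8', 'UTF-16BE');
        },
        $str
    );

    // Decode the remaining ordinary %XX escapes ("+" is left untouched).
    return rawurldecode($decoded);
}

echo decode_percent_u('%u4F60%u662F%u5982%u4F55%u505A%uFF1F'), "\n";
```

The trade-off is a dependency on mbstring in exchange for not maintaining the UTF-8 bit layout by hand.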

DaveRandom