0

I am working on a project where I have a C++ application which is hosting a REST API. The REST API receives HTTP requests from a library running on android. Various data gets posted to the REST API from the library, one of the items being posted is the device locale.

9 times out of 10 this is perfectly fine but occasionally there is a device where when the C++ parses the HTTP request and then tries to store the locale, the SQL fails.

The Locale that I am receiving in the HTTP request is %D8%A7%D9%84%D8%B9%D8%B1%D8%A8%D9%8A%D8%A9. When I URL Decode this it comes back as العربية

That's obviously my problem as its not a normal ASCII string which I'm expecting but I'm not sure the best way to get round it.

Within the C++ app I url decode it using the following:

string HTTPRequest::url_decode(string text) {
        char h;
        ostringstream escaped;
        escaped.fill('0');

        for (auto i = text.begin(), n = text.end(); i != n; ++i) {
                string::value_type c = (*i);

                if (c == '%') {
                        if (i[1] && i[2]) {
                                h = from_hex(i[1]) << 4 | from_hex(i[2]);
                                escaped << h;
                                i += 2;
                        }
                }
                else if (c == '+') {
                        escaped << ' ';
                }
                else {
                        escaped << c;
                }
        }

        return escaped.str();
}

I then try storing it in the database by using a prepared statement as follows:

statement->setString(3, this->getLocale().c_str());

The error I am getting back from MySQL when I try and insert it is:

Incorrect string value: '\xD8\xA7\xD9\x84\xD8\xB9...' for column 'Locale' at row 1.

What's the best way to get round this, I was thinking of changing the locale column to a blob, from varchar(50) so it can store it as binary as I'm assuming varchar can only deal with ASCII characters.

Would this be the right away to do this, or is there a better way to fix this problem?

halfer
  • 19,824
  • 17
  • 99
  • 186
Boardy
  • 35,417
  • 104
  • 256
  • 447

2 Answers2

0

it seems the default encoding for MySQL is Latin1. You may want to check your configuration file, and update it accordingly.

Change MySQL default character set to UTF-8 in my.cnf?

Syl
  • 2,733
  • 2
  • 17
  • 20
  • That was the reason, wonder why MySQL defaults to latin1 so I would have thought UTF-8 would be wanted the most to avoid these type of issues – Boardy Feb 17 '18 at 16:44
0
  • Declare that your connection is CHARACTER SET utf8 (or utf8mb4). (I don't know the details on C#; perhaps there is something in http://localhost/rjweb/mysql/doc.php/charcoll .)

  • Declare the column in the table to be CHARACTER SET utf8 (or utf8mb4)

  • '%D8%A7%D9%84%D8%B9%D8%B1%D8%A8%D9%8A%D8%A9' into UNHEX('D8A7D984D8B9D8B1D8A8D98AD8A9') when inserting. That is, remove the % first. Also this would work just in SQL: UNHEX(REPLACE('%D8%A7%D9%84%D8%B9%D8%B1%D8%A8%D9%8A%D8%A9', '%', ''))

Rick James
  • 135,179
  • 13
  • 127
  • 222