How can I store emoji from TextMesh Pro in a database using the WWW class to post the string containing the emoji as a parameter in the URL?

I'm having a problem getting emoji from a TextMesh Pro text field into my database. When I try, the emoji is stored as garbled text like this: 😂 or like this � or like this □, depending on which encoding I try.

I have a PHP script that uses an SQL statement to store the text in my MySQL database. When I manually type the URL of the PHP script into my browser and add the emoji as a parameter, it works perfectly fine: the emoji is stored properly (I have already set the collation of my database to utf8mb4).

Here's the part I don't understand: if I take the string that contains the emoji in my C# code and access the PHP file with the string as a parameter, it doesn't work. The emoji are stored as mojibake (the character sequence 😂 should be 😂).

It stores the text just fine otherwise, so there's no problem with the code below. Here's how it looks:

public IEnumerator UpdateChatCR(string gameID, string text)
{
    string hash = Md5Sum(gameID + secretKey);

    print("posting to chat db: " + text);

    string post_url = updateChatURL + "&game_id=" + gameID + "&text=" + text + "&hash=" + hash;
    WWW www = new WWW("http://" + post_url);
    yield return www;

    if (www.error != null)
    {
        print("There was an error updating the chat in the DB: " + www.error);
    }
    else
    {
        //print(www.text);
    }
}


I have tried different ways of encoding the text, but to be honest I have no idea what I'm doing at this point. I tried different variations of the following code, but with no luck:

byte[] bytes = Encoding.UTF8.GetBytes(tmpField.text);
string encodedText = Encoding.Default.GetString(bytes); 


I've been searching for an answer for the last few days. And I've come to the point where I realize I probably have to read a book about encoding and emoji unless there's someone out there who can help me out...

Thank you for your time.

derHugo
Magnus
  • Could you clarify what the purpose is of that strange code shown in the post (reading UTF-8 bytes as some other encoding)? Not exactly sure how that is related to your question... – Alexei Levenkov May 22 '19 at 18:32
  • @Machavity that duplicate doesn't apply here as they've already stated the PHP and MySQL side of things work fine. Magnus, I'd suggest reworking your question to focus on C#. If PHP is working fine, no need to tag it in the question. Provide more C# code, including a [mcve]. – miken32 May 22 '19 at 18:36
  • That's an interesting edit - you claim that "So the problem is not in my MySQL database, nor in my PHP." yet you've removed C#... So can you clarify what exactly you have a problem with? Consider an [edit] to your post to confirm that you can see whatever you wanted with hardcoded values, and show the data (as bytes preferably) and the actual code you tried... – Alexei Levenkov May 22 '19 at 18:41
  • Removing the C# tag was a mistake. I made another edit now. I'm not sure what code to include. Also, I know very little about encoding, so it's hard for me to understand the problem when I don't really know what the problem is. I know that doesn't help... – Magnus May 22 '19 at 18:51
  • How are you putting the emoji into textmesh pro? – BugFinder May 22 '19 at 18:51
  • I'm getting the data from the php script using the same WWW class as above. – Magnus May 22 '19 at 18:59
  • I don't know C#, but there has got to be a proper way of building your `post_url` variable. You need to encode things specially to be used in URLs. – miken32 May 22 '19 at 18:59
  • Yes I also think it's an encoding issue, I'm just not sure how to go about it. I've tried various ways of transcoding, with no luck. I really have no idea what I'm doing. – Magnus May 22 '19 at 19:02
  • So if you put it into textmesh pro using the php script, why do you need to send it back to php? – BugFinder May 22 '19 at 20:02
  • If a user inputs a new message that has an emoji in it, I need to store that message and the emoji properly in my database. – Magnus May 22 '19 at 20:21
  • Have you tried using the same encoding (let's say, Encoding.UTF8)? Also, the reason for incorrect file content might be the editor doesn't support emoji. – SGKoishi May 22 '19 at 20:28
  • If it's not accepting the encoding of the emoji as is, it may be worth considering parsing the emoji into something else before saving it. For example, have a parser look for the smile emoji symbols it uses and replace them with ":)" for saving in the database. Then when reading from the database, have it replace the ":)" with the emoji. – Tim Hunter May 22 '19 at 20:33
  • @TimHunter That's definitely a good idea and I will consider doing that if I decide to give up on finding a solution. It's just strange because I have a php script that uses a sql statement to store the text in my mysql database. And when I access the url of the php script and add the emoji as a parameter to store the emoji as a text in my database it works fine, but if I do that through my c# code, it doesn't work. It stores the emojis as mojibake. – Magnus May 22 '19 at 21:03
  • The issue might be that WWW internally automatically does some URL encoding since you want to send this as a URL. URLs are restricted to a certain charset and e.g. `:` is reserved for defining a port and therefore has no place in a URL. Also `)` is not an allowed character in a URL. – derHugo May 23 '19 at 03:39
  • You should rather use a [`UnityWebRequest.Post`](https://docs.unity3d.com/ScriptReference/Networking.UnityWebRequest.Post.html) and add your data e.g. as a `WWWForm` and in php rather use `$_POST["text"]` instead of `GET`. – derHugo May 23 '19 at 03:42

1 Answer

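Sending UTF-8 encoded text as part of the URL is error-prone, because servers produce wildly different results when decoding such URLs. The URL standard doesn't really cater for UTF-8 in URLs (see https://stackoverflow.com/a/1020299/511362), so you would be best off sending your text in an HTTP POST request. This is also why typing the URL into a browser works: the browser percent-encodes the emoji as UTF-8 before sending the request, a step the string concatenation in your C# code never performs.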

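// Runs inside a coroutine; needs `using UnityEngine.Networking;`.
WWWForm form = new WWWForm();
form.AddField("text", text); // sent UTF-8 encoded in the request body, not in the URL

using (UnityWebRequest www = UnityWebRequest.Post("http://www.my-server.com/myform", form))
{
    yield return www.SendWebRequest();

    if (!string.IsNullOrEmpty(www.error))
    {
        print("There was an error updating the chat in the DB: " + www.error);
    }
}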
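For comparison, if the text ever has to stay in the query string, each value would need to be percent-encoded before it is concatenated into the URL. Here is a minimal sketch of that, assuming the parameter names from the question; the class and method names (`ChatUrlBuilder`, `BuildQuery`) are made up for illustration. It relies on `Uri.EscapeDataString`, which converts the string to UTF-8 and escapes every byte that is not an unreserved URL character.

```csharp
using System;

public static class ChatUrlBuilder
{
    // Hypothetical helper: builds the query parameters from the question,
    // percent-encoding each value as UTF-8 so that emoji and reserved
    // characters such as '&' or '=' cannot corrupt the URL.
    public static string BuildQuery(string gameId, string text, string hash)
    {
        return "game_id=" + Uri.EscapeDataString(gameId)
             + "&text=" + Uri.EscapeDataString(text)
             + "&hash=" + Uri.EscapeDataString(hash);
    }
}
```

With this, `ChatUrlBuilder.BuildQuery("42", "hi 😂", "abc")` returns `game_id=42&text=hi%20%F0%9F%98%82&hash=abc`. POST is still the better fit for chat messages, since the text then never ends up in the URL at all.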

You should also rethink your security scheme. You use a hash without a nonce, so anyone who gets hold of the hash for a game ID can post messages for that game to your server; to an attacker, that hash is as good as a password. If you computed the hash as MD5(gameId+text+secretKey) you would be a lot better off, because an attacker could not send arbitrary messages with a stolen hash (they could still spam you with the same message, but you get the idea).
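A rough sketch of what hashing the message text along with the game ID could look like on the C# side. The `Md5Sum` helper from the question is not shown there, so `Md5Hex` below is a stand-in written for this example, and the class and method names are hypothetical. The PHP script would have to compute the same hash over the same UTF-8 bytes, in the same order, for the comparison to succeed.

```csharp
using System.Security.Cryptography;
using System.Text;

public static class ChatHash
{
    // Stand-in for the question's Md5Sum: lowercase hex MD5 of the UTF-8 bytes.
    public static string Md5Hex(string input)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
            StringBuilder hex = new StringBuilder(digest.Length * 2);
            foreach (byte b in digest)
            {
                hex.Append(b.ToString("x2"));
            }
            return hex.ToString();
        }
    }

    // Hash over game ID, message text, and shared secret, so a captured
    // hash is only valid for the exact message it was computed for.
    public static string ForMessage(string gameId, string text, string secretKey)
    {
        return Md5Hex(gameId + text + secretKey);
    }
}
```

In the coroutine from the question, `Md5Sum(gameID + secretKey)` would then become something like `ChatHash.ForMessage(gameID, text, secretKey)`.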

You should also not put that hash into the URL, where it will be visible in your HTTP server's access log files. Use the Authorization header for this. Finally, you should really use HTTPS to secure the payload.
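Putting these two points together might look roughly like the sketch below. The endpoint URL, the class name `ChatUploader`, and the `ChatHash` scheme name in the header value are all assumptions made for illustration; the server script would have to read and parse the Authorization header itself.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.Networking;

public class ChatUploader : MonoBehaviour
{
    // Hypothetical endpoint; note the https:// scheme.
    private const string UpdateChatUrl = "https://www.my-server.com/myform";

    public IEnumerator UpdateChat(string gameId, string text, string hash)
    {
        WWWForm form = new WWWForm();
        form.AddField("game_id", gameId);
        form.AddField("text", text);

        using (UnityWebRequest www = UnityWebRequest.Post(UpdateChatUrl, form))
        {
            // The hash travels in a header instead of the URL, so it does not
            // show up in access logs. "ChatHash" is a made-up scheme name.
            www.SetRequestHeader("Authorization", "ChatHash " + hash);

            yield return www.SendWebRequest();

            if (!string.IsNullOrEmpty(www.error))
            {
                Debug.Log("There was an error updating the chat in the DB: " + www.error);
            }
        }
    }
}
```

It would be started like any other coroutine, for example `StartCoroutine(uploader.UpdateChat(gameID, text, hash))`.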

Jan Thomä
  • This is it. Everything works perfectly now. I knew it was something about the encoding in the URL. I can't believe the problem I have been looking for an answer to actually just boils down to a GET vs. POST request. I should have paid more attention in my networking classes. Thank you so much! – Magnus May 23 '19 at 16:10