
I retrieve an encoded string (encoded with TextEncoder into UTF-8 and stringified before sending to the server) from the server using AJAX. I parse it upon retrieval and get an Object. I need to convert this Object back to a decoded string. TextDecoder seems to have a decode method, but it expects an ArrayBuffer or ArrayBufferView, not an Object. That method throws a TypeError if I pass my Object as-is:

var myStr = "This is a string, possibly with utf-8 or utf-16 chars.";
console.log("Original: " + myStr);

// Note: TextEncoder always produces UTF-8; any label passed to the
// constructor (e.g. "UTF-16") is ignored
var encoded = new TextEncoder().encode(myStr); // Uint8Array of UTF-8 bytes
console.log("Encoded: " + encoded);

var encStr = JSON.stringify(encoded);
console.log("Stringified: " + encStr);

//---------- Send it to the server; store in db; retrieve it later ---------

var parsedObj = JSON.parse(encStr); // Returns a plain Object, not a Uint8Array
console.log("Parsed: " + parsedObj);

// The following decode method expects an ArrayBuffer or ArrayBufferView only
var decStr = new TextDecoder().decode(parsedObj); // Throws TypeError

// Do something with the decoded string
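For reference, `JSON.stringify` serializes a Uint8Array as a plain object keyed by index (e.g. `{"0":84,"1":104,...}`), which is why `JSON.parse` hands back an Object. A minimal sketch of one apparent workaround (variable names are illustrative, and this assumes the object's keys are the original byte indices) is to rebuild the typed array from the parsed object's values before decoding:

```javascript
// JSON.parse of a stringified Uint8Array yields {"0":72,"1":105} for the
// UTF-8 bytes of "Hi". Integer-like keys enumerate in ascending order, so
// Object.values recovers the bytes in their original sequence.
var parsedObj = JSON.parse('{"0":72,"1":105}');
var bytes = Uint8Array.from(Object.values(parsedObj)); // rebuild typed array
var decStr = new TextDecoder().decode(bytes);          // TextDecoder accepts it
console.log(decStr); // "Hi"
```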

SO question 6965107 has an extensive discussion of converting between strings and ArrayBuffers, but none of those answers works for my situation. I also came across this article, which does not help when I have an Object.

Some posts suggest setting `responseType: "arraybuffer"`, which makes the server response arrive as an ArrayBuffer, but I cannot use that when retrieving this encoded string because the same result data contains many other items that need a different content type.

I am stuck and unable to find a solution after searching Google and SO for a day. I am open to any solution that lets me save "strings containing international characters" to the server and "retrieve them exactly as they were", except changing the content type, because these strings are bundled within JSON objects that also carry audio, video, and files. Any help or suggestions are highly appreciated.
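As context for the discussion below: JSON string values can natively carry any Unicode text, so (as the comments point out) no client-side TextEncoder step should be needed at all when the request body is transported as UTF-8. A quick sketch demonstrating the round trip (the property name is illustrative):

```javascript
// A JSON string value preserves arbitrary Unicode characters verbatim;
// stringify/parse round-trips international text without any byte-level
// encoding step on the client.
var original = "añoß中文🙂";
var payload = JSON.stringify({ internationalChars: original });
var roundTripped = JSON.parse(payload).internationalChars;
console.log(roundTripped === original); // true
```

If this fails in practice, the breakage is on the wire or server side (e.g. a missing `charset=UTF-8` or a database that is not configured for Unicode), not in JSON itself.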

  • `JSON.stringify(encoded)` is really weird (and causing problems because it treats the buffer as an object not as an array). Why are you doing that? Send the ArrayBuffer, or at least a JSON *array*, to the server, which will make everything much easier. – Bergi Aug 14 '20 at 21:04
  • "*save arbitrary strings to the server and retrieve them back*" is usually no big thing, assuming the server accepts the string encoding you use (utf8) and can store the values internally appropriately. You should not need a `TextEncoder` for that. Can you please show your code that does send your content? Just putting strings in JSON should work. If it doesn't, chances are high that the server side needs fixing. – Bergi Aug 14 '20 at 21:07
  • @Bergi, Thanks for the edit and comments. I tried to send the strings with international characters in them within JSON object, but with `contentType:application/json`, server-side is unable to parse the JSON object when there are unencoded strings. Only way I found to bundle international character codes with other data in the same JSON object is to encode the string into utf-8. Changing the content type is not an option because of the other items. Even when omitting content type, the data is automatically encoded by AJAX using the default content type for the page. – BReddy Aug 15 '20 at 20:59
  • 1
    Yes, the whole JSON text should be utf-8 encoded, and the `Content-Type` header should have the value `application/json; charset=UTF-8`. Make sure your server can correctly deal with that. – Bergi Aug 15 '20 at 21:03
  • @Bergi, my code to send to the server is: `$.ajax ({ type: 'POST', url: myUrl, data: {'item1' : 'str1', 'item2': 'str2', ..., 'internationalChars': encStr}, dataType: 'json', crossDomain: true, contentType:'application/json', beforeSend: ..., error: ..., success: ...});` – BReddy Aug 15 '20 at 21:08
  • @Bergi, thanks for the suggestion to uniformly encode the entire post data but not all items can be encoded into utf-8 because I am using the same JSON object to carry audio/video streams, object streams, etc. Each parameter in the POST has its own requirements and can't be uniform type of encoding. – BReddy Aug 15 '20 at 21:12
  • Huh? How do you encode a stream into JSON text? – Bergi Aug 15 '20 at 21:13
  • The number of parameters in this POST is rather large. We use various techniques for each parameter separately depending on the data in that parameter including plain text, BLOB, encoded strings (some don't need to be retrieved back by the clients, so this problem does not arise for them), and encoded strings (that need to be stored and retrieved for later display). The last item is the problem for us. – BReddy Aug 15 '20 at 21:23
  • Please post the code for how you put blobs into json. Also what's an "encoded string", how is that different from a plaintext string? And could you maybe also include the serverside code that is handling this request and decoding the json? – Bergi Aug 15 '20 at 21:27
  • [1] (canvas.toBlob(), File, and audio/video chunks from mediaRecorder - all become Blobs)->arrayBuffer (base64 data) ==> use MIME::Base64 (perl on server-side, with some logic to handle peculiarity of 'image', 'audio', 'video', 'file' types) - All these types go from client to the server without any modifications to them [2] English text ==> goes to server as strings [3] Non-English ==> goes to server in utf-8 encoded strings They go to the database. Perl and DB have requirements to encode (Chinese/Japanese/Korean/Unicode characters) or they fail. Actual code is too complex to post here. – BReddy Aug 16 '20 at 01:41
  • Our best option is to encode/decode these strings rather than changing contentType, server-side logic, database settings, etc., because those changes require concurrence from different teams and architects. Converting these strings to acceptable form to them and reconverting those strings back to displayable form is all on the client side and is in my hands. I thought I could do that using TextEncoder/TextDecoder, but seems more complex than meets the eye. – BReddy Aug 16 '20 at 01:46
  • @Bergi, Our best option is to encode/decode these strings rather than changing contentType, server-side logic, database settings, etc., because those changes require concurrence from different teams and architects. Converting these strings to acceptable form to them and reconverting those strings back to displayable form is all on the client side and is in my hands. I thought I could do that using TextEncoder/TextDecoder. If a person with your reputation is finding it not easily handled, either my approach is wrong or this really is a complex issue. – BReddy Aug 16 '20 at 01:52
  • Why even distinguish between "English" and "Non-English" input? It's all just text. I don't see any reason not to follow [standard best practices](http://utf8everywhere.org/). Yes, you should be using Unicode, and you should use utf-8 as the content type for your ajax request. It will transport the base64-encoded binary data just fine, no issues with that. And yes, your server and database also should use utf-8 encoding for text. Storing data that you cannot process is not worth the effort. – Bergi Aug 16 '20 at 02:37
  • But if your team lets you down and you have to transcode text on the client side because the server cannot process unicode properly, then just [convert those array buffers to base64](https://stackoverflow.com/questions/9267899/arraybuffer-to-base64-encoded-string) like you did for the other binary data. Do not `JSON.stringify` them directly. – Bergi Aug 16 '20 at 02:40
  • @Bergi, I might have to do that instead of trying to use TextEncoder/TextDecoder and JSON.stringify. Thanks for the input. – BReddy Aug 16 '20 at 13:45
  • I meant the arraybuffer that the `TextEncoder` returns - you still need that to transform the input to binary data if you cannot send it as normal text. – Bergi Aug 16 '20 at 13:50
  • Ok, I see now. I have to experiment with this idea little more. Thanks. – BReddy Aug 16 '20 at 14:01
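A minimal sketch of the approach Bergi suggests in the comments above (function names are illustrative; this assumes the browser globals `btoa`/`atob` are available): run the string through TextEncoder to get UTF-8 bytes, base64-encode those bytes so they travel as a plain ASCII string inside the JSON payload, and reverse both steps after retrieval:

```javascript
// Transcode text -> UTF-8 bytes -> base64 for safe JSON transport.
function encodeForJson(str) {
  var bytes = new TextEncoder().encode(str); // always UTF-8
  var bin = "";
  for (var i = 0; i < bytes.length; i++) {
    bin += String.fromCharCode(bytes[i]);    // byte -> "binary string" char
  }
  return btoa(bin);                          // base64: ASCII-only, JSON-safe
}

// Reverse: base64 -> UTF-8 bytes -> text.
function decodeFromJson(b64) {
  var bin = atob(b64);
  var bytes = new Uint8Array(bin.length);
  for (var i = 0; i < bin.length; i++) {
    bytes[i] = bin.charCodeAt(i);
  }
  return new TextDecoder().decode(bytes);
}

console.log(decodeFromJson(encodeForJson("日本語 test"))); // round-trips intact
```

Unlike `JSON.stringify` on the Uint8Array itself, the base64 string survives JSON serialization, database storage, and retrieval unchanged, and the byte-index-keyed Object problem never arises.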

0 Answers