I am passing a string from JavaScript to a React Native native Java module and then back to JavaScript. However, any high Unicode characters, such as emojis, become corrupted after being passed to Java and turn into a pair of question marks.

For example, the string "testing123" followed by an emoji becomes "testing123??".

How can I fix this so that the characters retain their values?

EDIT: The string is being processed by a React Native background upload library. Here is an excerpt of the code from that library that passes the text (which is in the parameters field) to the Java module:

import { NativeModules, DeviceEventEmitter } from 'react-native'
export type StartUploadArgs = {
  url: string,
  path: string,
  method?: 'PUT' | 'POST',
  // Optional, because raw is default
  type?: 'raw' | 'multipart',
  // This option is needed for multipart type
  field?: string,
  customUploadId?: string,
  // parameters are supported only in multipart type
  parameters?: { [string]: string },
  headers?: Object,
  notification?: NotificationArgs
}
const NativeModule = NativeModules.VydiaRNFileUploader || NativeModules.RNFileUploader // iOS is VydiaRNFileUploader and Android is RNFileUploader
//...
export const startUpload = (options: StartUploadArgs): Promise<string> => NativeModule.startUpload(options)

And here is an excerpt of the Java code that handles the string:

  @ReactMethod
  public void startUpload(ReadableMap options, final Promise promise) {
//...
      HttpUploadRequest<?> request;

      if (requestType.equals("raw")) {
        request = new BinaryUploadRequest(this.getReactApplicationContext(), customUploadId, url)
                .setFileToUpload(filePath);
      } else {
        if (!options.hasKey("field")) {
          promise.reject(new IllegalArgumentException("field is required field for multipart type."));
          return;
        }

        if (options.getType("field") != ReadableType.String) {
          promise.reject(new IllegalArgumentException("field must be string."));
          return;
        }

        request = new MultipartUploadRequest(this.getReactApplicationContext(), customUploadId, url)
                .addFileToUpload(filePath, options.getString("field"));
      }


      request.setMethod(method)
        .setMaxRetries(2)
        .setDelegate(statusDelegate);
//...
      if (options.hasKey("parameters")) {
        if (requestType.equals("raw")) {
          promise.reject(new IllegalArgumentException("Parameters supported only in multipart type"));
          return;
        }

        ReadableMap parameters = options.getMap("parameters");
        ReadableMapKeySetIterator keys = parameters.keySetIterator();

        while (keys.hasNextKey()) {
          String key = keys.nextKey();

          if (parameters.getType(key) != ReadableType.String) {
            promise.reject(new IllegalArgumentException("Parameters must be string key/values. Value was invalid for '" + key + "'"));
            return;
          }
          request.addParameter(key, parameters.getString(key));
        }
      }
//...
      String uploadId = request.startUpload();
      promise.resolve(uploadId);
  }
Eli
  • Please share your Java code that is ingesting this string. It's likely an encoding issue there. – user2254180 Jul 31 '19 at 14:21
  • You need to escape Unicode characters before passing them to Java, or to any other platform for that matter. Refer to [this](https://gist.github.com/mathiasbynens/1243213) method. – tarzen chugh Jul 31 '19 at 14:23
  • Start by showing the code that sends the string from JavaScript to Java - you're probably doing something wrong with encoding there. Alternatively, but less likely, you're doing something wrong on the receiving side. – Erwin Bolwidt Jul 31 '19 at 14:24
  • I've updated the question with the JS and Java code that handles the string. – Eli Jul 31 '19 at 15:20

2 Answers

The Java Servlet specification assumes form parameters are ISO-8859-1 by default. Assuming you are using Tomcat, see https://cwiki.apache.org/confluence/display/TOMCAT/Character+Encoding for information on how to resolve this issue.

Relevant quote from the page:

POST requests should specify the encoding of the parameters and values they send. Since many clients fail to set an explicit encoding, the default used is US-ASCII for application/x-www-form-urlencoded and ISO-8859-1 for all other content types.

Related SO post: https://stackoverflow.com/a/19409520/1967484

Keep in mind that it's also possible for your console or your database to not support high Unicode characters.
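
As a rough illustration of why the charset matters, here is a minimal Java sketch. It is not code from any library mentioned here; the class and method names are made up for the example. It encodes a string containing an emoji to bytes and decodes it again with the same charset. Only UTF-8 survives the round trip; US-ASCII and ISO-8859-1 cannot represent the emoji, so Java substitutes question marks.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library discussed here.
class CharsetRoundTrip {
    // Encodes the text to bytes with the given charset, then decodes
    // those bytes with the same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // U+1F600 written as a surrogate pair so the source file's own
        // encoding does not affect the result.
        String text = "testing123\uD83D\uDE00";

        // UTF-8 can represent every Unicode code point.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text));      // true
        // These charsets have no mapping for the emoji, so it is replaced.
        System.out.println(roundTrip(text, StandardCharsets.US_ASCII).equals(text));   // false
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1).equals(text)); // false
    }
}
```

How many question marks appear per emoji can vary between runtimes, so the sketch only checks whether the original text is preserved.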

Deadron
  • I am not using a Java servlet in my application. The Java side sends the string to a Node server, but it seems like the text gets corrupted before being sent. – Eli Jul 31 '19 at 18:16
  • Your ?? problem is a clear sign of an encoding issue. Make sure you set the charset on every HTTP request made in the process, and it will probably fix your problem. – Deadron Jul 31 '19 at 18:42
  • This was the issue. The background upload library I'm using wasn't exposing a method to set the charset to UTF-8, but once I set it manually it worked properly. Thanks for the help. – Eli Aug 01 '19 at 13:15

Modifying the background upload library's code like this fixed the issue:

request = new MultipartUploadRequest(this.getReactApplicationContext(), customUploadId, url)
        .addFileToUpload(filePath, options.getString("field"))
        .setUtf8Charset(); // add this line
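
To show where the charset comes into play on the wire, here is a hedged sketch of how a single text parameter might be serialized as a multipart/form-data part. This is not the upload library's actual implementation, which I have not reproduced here; `MultipartTextPartSketch`, `textPart`, the boundary, and the parameter name are all made up for illustration. The point is only that the parameter value has to be turned into bytes with some charset, and anything other than UTF-8 (or another Unicode encoding) loses the emoji.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the upload library's code.
class MultipartTextPartSketch {
    // Builds the bytes of one text part of a multipart/form-data body.
    // Assumes the boundary and parameter name are plain ASCII.
    static byte[] textPart(String boundary, String name, String value, Charset charset) {
        String headers = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + name + "\"\r\n"
                + "Content-Type: text/plain; charset=" + charset.name() + "\r\n"
                + "\r\n";
        byte[] headerBytes = headers.getBytes(StandardCharsets.US_ASCII);
        // This is the step where a non-Unicode charset corrupts the value.
        byte[] valueBytes = value.getBytes(charset);
        byte[] crlf = "\r\n".getBytes(StandardCharsets.US_ASCII);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(headerBytes, 0, headerBytes.length);
        out.write(valueBytes, 0, valueBytes.length);
        out.write(crlf, 0, crlf.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String value = "testing123\uD83D\uDE00"; // "testing123" plus U+1F600
        byte[] utf8Part = textPart("boundary42", "caption", value, StandardCharsets.UTF_8);
        byte[] asciiPart = textPart("boundary42", "caption", value, StandardCharsets.US_ASCII);

        // A receiver decoding the part as UTF-8 only recovers the value from the UTF-8 part.
        System.out.println(new String(utf8Part, StandardCharsets.UTF_8).contains(value));  // true
        System.out.println(new String(asciiPart, StandardCharsets.UTF_8).contains(value)); // false
    }
}
```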
Eli