I am trying to use Gson to serialize a Java object to JSON and get a byte array that represents that JSON. I need a byte array because I pass the output on to an external dependency that requires one.

public byte[] serialize(Object object){
  return gson.toJson(object).getBytes();
}

I have 2 questions:

  1. If the input is a String, Gson seems to return the String as is, without validating it. Is this expected? I'd like to use Gson in a way that validates that the input object is actually JSON. How can I do this?
  2. I'm going to invoke this serialize function several thousand times over a short period. Converting to a String and then to a byte[] could add unwanted overhead. Is there a more efficient way to get the byte[]?
RagHaven
  • _I'd like to use Gson in a way that it would validate that the input object is actually Json._ -- You're asking about serialization, but now it sounds like you're going to deserialize. Does passing JSONs to `Gson.toJson` make any sense? – Lyubomyr Shaydariv Mar 30 '17 at 17:45
  • Well, I guess I need both serialization and deserialization if I want to validate, right? – RagHaven Mar 30 '17 at 17:53
  • I mean that your question #1 remains unclear: you're asking it in the context of the code snippet you've provided, which doesn't mention deserialization at all. Could you please elaborate? – Lyubomyr Shaydariv Mar 30 '17 at 17:55
  • I was using `Gson.toJson` in the hope that it would validate whether the object can actually be represented as Json. That was the intention. I don't want to deserialize. I simply want to validate and then serialize. On reading the `Gson` docs I thought `Gson.toJson` would provide the ability to validate. It happens to return a String, so getting the byte[] could be accomplished by getBytes(). Is it more clear now? I guess I should have added more of an explanation. – RagHaven Mar 30 '17 at 18:38
  • But `Gson.toJson` doesn't seem to validate whether a String input is actually JSON. Also, I'm not sure this approach is the most performant way to validate and then serialize. – RagHaven Mar 30 '17 at 18:39

2 Answers

edit: my answer on point 1 was misinformed.

2) The vanilla Gson converter incurs a lot of unnecessary reflection overhead. In your case, writing a custom type adapter should give a real performance benefit. This article has more information on that: https://open.blogs.nytimes.com/2016/02/11/improving-startup-time-in-the-nytimes-android-app/?_r=0

Canoe

If the input is a String gson seems to return the String as is. It doesn't do any validation of the input. Is this expected?

Yes, this is expected. Gson.toJson() treats the String as a value to serialize, not as a JSON document, so it returns the JSON string literal for it (quoted and escaped).

I'd like to use Gson in a way that it would validate that the input object is actually Json. How could I do this?

There is no need as such. Gson.toJson() accepts objects to be serialized and always generates valid JSON. If you mean deserialization, then Gson fails fast on invalid JSON documents during reading/parsing/deserialization (strictly speaking during reading, which is the bottom-most layer of Gson).
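
To make the fail-fast behaviour on the reading side concrete, here is a minimal sketch of checking that a String is well-formed JSON before passing it on. It assumes Gson 2.x on the classpath; the class and method names (`JsonValidation`, `isValidJson`) are made up for illustration and are not part of Gson. It reads the text through a `JsonReader`, which is strict unless you explicitly make it lenient, and then checks that nothing follows the top-level value.

```java
import com.google.gson.Gson;
import com.google.gson.JsonElement;
import com.google.gson.JsonParseException;
import com.google.gson.TypeAdapter;
import com.google.gson.stream.JsonReader;
import com.google.gson.stream.JsonToken;

import java.io.IOException;
import java.io.StringReader;

public final class JsonValidation {

    // Adapter that reads any JSON value into Gson's tree model.
    private static final TypeAdapter<JsonElement> ELEMENT_ADAPTER =
            new Gson().getAdapter(JsonElement.class);

    private JsonValidation() {
    }

    /** Hypothetical helper: true if the text is exactly one well-formed JSON value. */
    public static boolean isValidJson(final String text) {
        // A freshly created JsonReader is strict, so malformed input raises an exception.
        try (JsonReader reader = new JsonReader(new StringReader(text))) {
            ELEMENT_ADAPTER.read(reader);
            // Reject anything other than whitespace after the top-level value.
            return reader.peek() == JsonToken.END_DOCUMENT;
        } catch (final IOException | JsonParseException ex) {
            return false;
        }
    }
}
```

With this sketch, `isValidJson("{\"a\":1}")` should return true and `isValidJson("{a:1}")` should return false, because unquoted names are only accepted in lenient mode. If the text passes, it can be converted to bytes directly; there is no need to run it through toJson() again.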

I'm gonna be invoking this serialize function several thousands of times over a short period. Converting to String and then to byte[] could be some unwanted overhead. Is there a more optimal way to get the byte[]?

Yes, building a JSON string only to copy its contents into a separate array wastes memory, of course. Gson is basically a stream-oriented tool: the Gson.toJson overloads that accept an Appendable are essentially the Gson core (take a quick look at how Gson.toJson(Object) works: it just creates a StringWriter instance to accumulate a string and delegates to the Appendable overload). It would be extremely cool if Gson could emit JSON tokens via a Reader rather than writing to an Appendable, but this idea was refused and most likely will never be implemented in Gson, unfortunately. Since Gson does not emit JSON tokens during serialization in a read-semantics manner (from your code's perspective), you have to buffer the whole result:

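// Needs java.io.*, java.nio.charset.StandardCharsets and a shared Gson instance named gson.
private static byte[] serializeToBytes(final Object object)
        throws IOException {
    final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    // Name the charset explicitly; the one-argument constructor uses the platform default.
    final OutputStreamWriter writer = new OutputStreamWriter(outputStream, StandardCharsets.UTF_8);
    gson.toJson(object, writer);
    // Push any buffered characters through the encoder into the byte stream.
    writer.flush();
    return outputStream.toByteArray();
}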

This version does not use a StringWriter, so it does not accumulate an intermediate string and copy arrays back and forth. I don't know whether there are writers/output streams that can reuse existing byte arrays, but I believe there should be some, because reuse would serve the performance goal you mentioned in your question.
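
On the reuse point, one standard-library option is `ByteArrayOutputStream.reset()`, which discards the accumulated bytes but keeps the already allocated buffer. The following is only a sketch under assumptions: `ReusableJsonBuffer` and `JsonEmitter` are hypothetical names, and `JsonEmitter` stands in for a call such as `gson.toJson(object, appendable)` so that the example compiles without Gson. Whether this actually helps would need to be measured; `toByteArray()` still makes one copy per call, and a new `OutputStreamWriter` is still created each time.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public final class ReusableJsonBuffer {

    /** Hypothetical stand-in for a call such as gson.toJson(object, appendable). */
    @FunctionalInterface
    public interface JsonEmitter {
        void emitTo(Appendable out) throws IOException;
    }

    // Not thread-safe: use one instance per thread.
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream(256);

    public byte[] toBytes(final JsonEmitter emitter) {
        // Discards the previous content but keeps the allocated capacity.
        buffer.reset();
        try {
            final Writer writer = new OutputStreamWriter(buffer, StandardCharsets.UTF_8);
            emitter.emitTo(writer);
            // Push buffered characters through the encoder into the byte buffer.
            writer.flush();
        } catch (final IOException ex) {
            throw new UncheckedIOException(ex);
        }
        // Still one copy, sized exactly to the bytes written in this call.
        return buffer.toByteArray();
    }
}
```

With Gson, a call could look like `buffer.toBytes(out -> gson.toJson(object, out))`, assuming `gson` and `object` are in scope.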

If possible, also check whether your library's API exposes or accepts OutputStreams: you could then pass such an output stream to the serializeToBytes method, or even remove the method. If the library can consume input streams, not just byte arrays, you could also look at converting output streams to input streams, so that the method returns an InputStream or a Reader (this adds some overhead, but can process unbounded data, so you need to find the balance):

private static InputStream serializeToByteStream(final Object object)
        throws IOException {
    final PipedInputStream inputStream = new PipedInputStream();
    final OutputStream outputStream = new PipedOutputStream(inputStream);
    new Thread(() -> {
        try {
            final OutputStreamWriter writer = new OutputStreamWriter(outputStream, StandardCharsets.UTF_8);
            gson.toJson(object, writer);
            writer.flush();
        } catch ( final IOException ex ) {
            throw new RuntimeException(ex);
        } finally {
            try {
                outputStream.close();
            } catch ( final IOException ex ) {
                throw new RuntimeException(ex);
            }
        }
    }).start();
    return inputStream;
}

Example of use:

final String value = "foo";
System.out.println(Arrays.toString(serializeToBytes(value)));
try ( final InputStream inputStream = serializeToByteStream(value) ) {
    int b;
    while ( (b = inputStream.read()) != -1 ) {
        System.out.print(b);
        System.out.print(' ');
    }
    System.out.println();
}

Output:

[34, 102, 111, 111, 34]
34 102 111 111 34

Both are the ASCII codes of the JSON string literal "foo", including the surrounding double quotes (code 34).
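
The sample output is pure ASCII, so any ASCII-compatible charset would produce the same bytes. Once the JSON contains non-ASCII characters, the bytes depend on the charset used for the conversion, which is why it is safer to name one explicitly instead of relying on the platform default. A small sketch of the difference, with made-up names (`JsonBytes`, `utf8`, `latin1`):

```java
import java.nio.charset.StandardCharsets;

public final class JsonBytes {

    private JsonBytes() {
    }

    /** Encodes JSON text as UTF-8 regardless of the platform default charset. */
    public static byte[] utf8(final String json) {
        return json.getBytes(StandardCharsets.UTF_8);
    }

    /** The same text encoded as ISO-8859-1, shown only for comparison. */
    public static byte[] latin1(final String json) {
        return json.getBytes(StandardCharsets.ISO_8859_1);
    }
}
```

For `"\"foo\""` both methods return [34, 102, 111, 111, 34]. For `"\"\u00e9\""` (an é in quotes), `utf8` returns [34, -61, -87, 34] while `latin1` returns [34, -23, 34], so the receiving side has to agree on the charset.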

Lyubomyr Shaydariv