If the input is a String gson seems to return the String as is. It doesn't do any validation of the input. Is this expected?
Yes, this is fine. It just returns a JSON string representation of the given string.
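To make "JSON string representation" concrete, here is a rough sketch of what turning a Java string into a JSON string literal involves: wrapping it in double quotes and escaping the characters that JSON requires to be escaped. This is only a simplified illustration, not Gson's actual implementation (the JsonStringSketch class and toJsonString method are made-up names, and Gson's own writer may escape more characters than this):

```java
final class JsonStringSketch {

    private JsonStringSketch() {
    }

    /** Wraps a string in double quotes and escapes quotes, backslashes and control characters. */
    static String toJsonString(final String value) {
        final StringBuilder builder = new StringBuilder(value.length() + 2);
        builder.append('"');
        for ( int i = 0; i < value.length(); i++ ) {
            final char c = value.charAt(i);
            if ( c == '"' || c == '\\' ) {
                // quotes and backslashes get a leading backslash
                builder.append('\\').append(c);
            } else if ( c < 0x20 ) {
                // control characters are written in the six-character hexadecimal escape form
                builder.append(String.format("\\u%04x", (int) c));
            } else {
                builder.append(c);
            }
        }
        return builder.append('"').toString();
    }
}
```

So a plain string such as foo is not returned "as is": the result is two characters longer because of the surrounding quotes.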
I'd like to use Gson in a way that it would validate that the input object is actually Json. How could I do this?
No need per se. The Gson.toJson() method accepts objects to be serialized, and it always generates valid JSON. If you mean deserialization, then Gson fails fast on invalid JSON documents during reading/parsing/deserialization (strictly speaking during reading, which is the bottom-most layer of Gson).
I'm gonna be invoking this serialize function several thousands of times over a short period. Converting to String and then to byte[] could be some unwanted overhead. Is there a more optimal way to get the byte[]?
Yes, accumulating a JSON string just in order to expose a clone of its internal char[] is a waste of memory, of course. Gson is basically a stream-oriented tool, and note that there are Gson.toJson method overloads accepting an Appendable that are basically the Gson core (just take a quick look at how Gson.toJson(Object) works: it just creates a StringWriter instance to accumulate a string through the Appendable interface). It would be extremely cool if Gson could emit JSON tokens via a Reader rather than writing to an Appendable, but this idea was refused and most likely will never be implemented in Gson, unfortunately. Since Gson does not emit JSON tokens during serialization in a read-semantics manner (from your code's perspective), you have to buffer the whole result:
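private static byte[] serializeToBytes(final Object object)
        throws IOException {
    final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    // Encode explicitly as UTF-8 rather than relying on the platform default charset
    final OutputStreamWriter writer = new OutputStreamWriter(outputStream, StandardCharsets.UTF_8);
    // gson is assumed to be a shared Gson instance declared elsewhere
    gson.toJson(object, writer);
    writer.flush();
    return outputStream.toByteArray();
}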
This one does not use a StringWriter, so it does not accumulate an intermediate string and then copy its contents back and forth between arrays. I don't know if there are writers/output streams that can utilize/re-use existing byte arrays, but I believe there should be some, because that would serve the performance purposes you mentioned in your question well.
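On the point of re-using buffers: one option from the standard library is ByteArrayOutputStream.reset(), which discards the written content but keeps the already allocated backing array. Below is a minimal sketch, assuming a single-threaded caller. The ReusableJsonBuffer class and the JsonWriterFunction interface are hypothetical names; the latter stands in for a call such as gson.toJson(object, writer):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class ReusableJsonBuffer {

    /** Hypothetical stand-in for a call such as gson.toJson(object, writer). */
    interface JsonWriterFunction {
        void write(Object object, Appendable out) throws IOException;
    }

    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream(1024);
    private final JsonWriterFunction jsonWriter;

    ReusableJsonBuffer(final JsonWriterFunction jsonWriter) {
        this.jsonWriter = jsonWriter;
    }

    /** Not thread-safe: every call reuses the same backing array. */
    byte[] serializeToBytes(final Object object) throws IOException {
        // reset() drops the previous content but keeps the allocated array
        buffer.reset();
        final Writer writer = new OutputStreamWriter(buffer, StandardCharsets.UTF_8);
        jsonWriter.write(object, writer);
        writer.flush();
        // one copy, sized exactly to the bytes written by this call
        return buffer.toByteArray();
    }
}
```

With Gson this could presumably be constructed as new ReusableJsonBuffer((object, out) -> gson.toJson(object, out)). Note that toByteArray() still copies the result once, so the sketch only saves the repeated growth of the internal buffer across thousands of calls.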
If possible, you can also check whether your library interface/API exposes or accepts OutputStreams somehow: then you could probably pass such output streams to the serializeToBytes method easily, or even remove the method. If it can use input streams, not just byte arrays, you could also take a look at converting output streams to input streams, so that the serializeToBytes method could return an InputStream or a Reader (this requires some overhead, but it can process unbounded data, so you need to find the balance):
private static InputStream serializeToByteStream(final Object object)
        throws IOException {
    final PipedInputStream inputStream = new PipedInputStream();
    final OutputStream outputStream = new PipedOutputStream(inputStream);
    // Piped streams need a separate producer thread, otherwise the reader would block forever
    new Thread(() -> {
        try {
            // Encode explicitly as UTF-8 rather than relying on the platform default charset
            final OutputStreamWriter writer = new OutputStreamWriter(outputStream, StandardCharsets.UTF_8);
            gson.toJson(object, writer);
            writer.flush();
        } catch ( final IOException ex ) {
            throw new RuntimeException(ex);
        } finally {
            try {
                // Closing the output end signals end-of-stream to the reader
                outputStream.close();
            } catch ( final IOException ex ) {
                throw new RuntimeException(ex);
            }
        }
    }).start();
    return inputStream;
}
Example of use:
final String value = "foo";
System.out.println(Arrays.toString(serializeToBytes(value)));
try ( final InputStream inputStream = serializeToByteStream(value) ) {
    int b;
    while ( (b = inputStream.read()) != -1 ) {
        System.out.print(b);
        System.out.print(' ');
    }
    System.out.println();
}
Output:
[34, 102, 111, 111, 34]
34 102 111 111 34
Both are the ASCII codes of the JSON string literal "foo", including its surrounding double quotes (code 34).
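The Reader variant mentioned earlier could be sketched the same way with PipedReader/PipedWriter, which skips the byte encoding step entirely. This is only a sketch under assumptions: PipedJsonReaders, serializeToReader and readAll are made-up names, and JsonWriterFunction is a hypothetical stand-in for a call such as gson.toJson(object, writer):

```java
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.Reader;

final class PipedJsonReaders {

    /** Hypothetical stand-in for a call such as gson.toJson(object, writer). */
    interface JsonWriterFunction {
        void write(Object object, Appendable out) throws IOException;
    }

    private PipedJsonReaders() {
    }

    static Reader serializeToReader(final Object object, final JsonWriterFunction jsonWriter)
            throws IOException {
        final PipedReader reader = new PipedReader();
        final PipedWriter writer = new PipedWriter(reader);
        final Thread producer = new Thread(() -> {
            // try-with-resources closes the writer, which signals end-of-stream to the reader
            try ( final PipedWriter out = writer ) {
                jsonWriter.write(object, out);
            } catch ( final IOException ex ) {
                throw new RuntimeException(ex);
            }
        });
        producer.setDaemon(true);
        producer.start();
        return reader;
    }

    /** Drains a reader into a string; for demonstration only, since it buffers everything. */
    static String readAll(final Reader reader) throws IOException {
        final StringBuilder builder = new StringBuilder();
        final char[] chunk = new char[256];
        int read;
        while ( (read = reader.read(chunk)) != -1 ) {
            builder.append(chunk, 0, read);
        }
        return builder.toString();
    }
}
```

As with the byte stream version, the consumer must read from a different thread than the producer, and the caller is responsible for closing the returned Reader.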