
I need to write a client that communicates with a server over a socket. The message protocol is JSON. The server pushes multiple JSON blocks to the client on demand. A message looks like this:

{"a": 1, "b": { "c": 1}}{"a": 1, "b": { "c": 1}}... 

As you can see, there is no separator or identifier between the JSON blocks.

The JSON parsers I can find (such as fastjson and Jackson) can only handle the stream as a single JSON block, even through the streaming APIs they provide. When I use these APIs to parse the stream, they throw an exception at the end of the first JSON block, saying that the next token "{" is invalid.

Is there a JSON parser in Java that can handle this? Or is there another way to solve the problem?

Terry Ma
  • Possible duplicate: http://stackoverflow.com/questions/13384454/split-json-objects-using-a-regular-expression – brunobastosg Mar 14 '15 at 02:41
  • I checked the answer in the link above. My situation is a little different: the server provides the JSON as a stream, so I can't easily use a regex. – Terry Ma Mar 14 '15 at 02:46
  • Can't you convert the stream to a string and then use a regex? – brunobastosg Mar 14 '15 at 02:52
  • So the result is like an array, except it's not an array for any JSON parser. Do you have nested blocks of JSON? I mean, can you assume that `}{` always separates one element from the following one? – fps Mar 14 '15 at 03:51
  • @Magnamag Yes, I have nested blocks, so I can't just separate blocks by "}{". There may also be "{" and "}" inside field values. – Terry Ma Mar 14 '15 at 05:24
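
To illustrate the point made in the comments above, here is a minimal sketch (not taken from the original post) of why splitting on `}{` fails and what a brace counter that ignores string contents does instead. The class and method names are hypothetical, and it assumes every top-level value is a JSON object and that the whole input is already in memory.

```java
import java.util.ArrayList;
import java.util.List;

class JsonBlockSplitDemo {

    /** Splits concatenated top-level JSON objects; assumes the input holds only complete objects. */
    static List<String> splitBlocks(String input) {
        List<String> blocks = new ArrayList<>();
        int depth = 0;           // current nesting level of { }
        int blockStart = -1;     // index of the '{' that opened the current block
        boolean inString = false;
        boolean escaped = false; // true right after a backslash inside a string
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;       // this character was escaped, ignore it
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                if (depth == 0) {
                    blockStart = i;
                }
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
                if (depth == 0) {
                    blocks.add(input.substring(blockStart, i + 1));
                }
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        // The first object has "}{" inside a string value and a nested object.
        String wire = "{\"a\": \"x}{y\", \"b\": {\"c\": 1}}{\"a\": 2}";
        System.out.println(wire.split("\\}\\{").length); // naive split yields 3 fragments
        System.out.println(splitBlocks(wire));           // two complete objects
    }
}
```

The naive split cuts the input in three places that do not line up with object boundaries, while the counter returns the two original objects unchanged.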

2 Answers


You can do it with Genson. First configure it to allow "permissive" parsing, then deserialize the values into an iterator. Genson parses the objects one by one as you call next on the iterator, so you can process very large inputs this way.

    Genson genson = new GensonBuilder().usePermissiveParsing(true).create();
    ObjectReader reader = genson.createReader(inputStream);
    Iterator<SomeObject> iterator = genson.deserializeValues(reader, GenericType.of(SomeObject.class));

This part of the API is a bit verbose because the use case is uncommon.

UPDATE: In Genson 1.4, usePermissiveParsing was removed; root values that are not wrapped in an array are now accepted by default. See https://github.com/owlike/genson/issues/78

eugen
  • Damned if I can find this usePermissiveParsing method on the GensonBuilder – richard Jun 10 '16 at 04:24
  • Starting with Genson 1.4, usePermissiveParsing has been removed; it should work by default. See https://github.com/owlike/genson/issues/78 – eugen Jun 10 '16 at 20:04

In the end, it seems no JSON parser in Java suits my situation. I am using Netty to build a network application. With NIO, when data arrives from the network, the decode method of ByteToMessageDecoder is called. In this method I need to extract the JSON blocks from the ByteBuf.

Since I found no suitable JSON parser, I wrote a method to split the JSON blocks out of the ByteBuf.

    import io.netty.buffer.ByteBuf;
    import java.nio.charset.StandardCharsets;
    import java.util.List;

    // These constants are not shown in the original post; the values here are assumed.
    private static final byte SYM_L_BRACKET = '{';
    private static final byte SYM_R_BRACKET = '}';
    private static final byte SYM_QUOTE = '"';
    private static final byte SYM_ESCAPE = '\\';

    public static void extractJsonBlocks(ByteBuf buf, List<Object> out) {
        // scan the readable region by absolute index; the reader index may not be 0
        int start = buf.readerIndex();
        int end = buf.writerIndex();
        // index where the next, not yet complete, JSON block begins
        int blockStart = start;
        int bracketDepth = 0;
        // whether the current byte is inside a string value
        boolean inStr = false;
        // whether the previous byte was an unescaped backslash inside a string
        boolean escaped = false;
        for (int i = start; i < end; i++) {
            // getByte does not move the reader index
            byte b = buf.getByte(i);
            if (inStr) {
                if (escaped) {
                    // this byte is escaped, so it cannot end the string
                    escaped = false;
                } else if (b == SYM_ESCAPE) {
                    escaped = true;
                } else if (b == SYM_QUOTE) {
                    inStr = false;
                }
            } else if (b == SYM_QUOTE) {
                inStr = true;
            } else if (b == SYM_L_BRACKET) {
                bracketDepth++;
            } else if (b == SYM_R_BRACKET && bracketDepth > 0) {
                bracketDepth--;
                if (bracketDepth == 0) {
                    // depth is back to 0, so a whole JSON block has been found
                    out.add(buf.toString(blockStart, i + 1 - blockStart, StandardCharsets.UTF_8).trim());
                    blockStart = i + 1;
                }
            }
        }
        // Bytes after the last complete block cannot form a JSON block yet; leave them
        // in the ByteBuf so they are combined with the data that arrives next,
        // and discard the bytes already emitted as JSON blocks.
        buf.readerIndex(blockStart);
        buf.discardReadBytes();
    }

This may not be a perfect parser, but it works well for my application so far.
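
As a complement, here is a minimal JDK-only sketch of the same idea that keeps its state between reads, so a block split across several network chunks does not have to be rescanned from its start. It is not the code from this answer: the class name `JsonBlockAccumulator` and its `feed` method are hypothetical, and it assumes every top-level value is a JSON object encoded as UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class JsonBlockAccumulator {
    private final ByteArrayOutputStream current = new ByteArrayOutputStream();
    private int depth = 0;            // nesting level of { } in the current block
    private boolean inString = false; // inside a JSON string value
    private boolean escaped = false;  // previous byte was an unescaped backslash

    /** Feeds one chunk of bytes and returns every JSON object completed by it. */
    List<String> feed(byte[] chunk) {
        List<String> complete = new ArrayList<>();
        for (byte b : chunk) {
            if (depth == 0 && b != '{') {
                continue; // skip whitespace or stray bytes between top-level objects
            }
            current.write(b);
            if (inString) {
                if (escaped) {
                    escaped = false;
                } else if (b == '\\') {
                    escaped = true;
                } else if (b == '"') {
                    inString = false;
                }
            } else if (b == '"') {
                inString = true;
            } else if (b == '{') {
                depth++;
            } else if (b == '}') {
                depth--;
                if (depth == 0) {
                    // a whole top-level object has been collected
                    complete.add(new String(current.toByteArray(), StandardCharsets.UTF_8));
                    current.reset();
                }
            }
        }
        return complete;
    }

    public static void main(String[] args) {
        JsonBlockAccumulator acc = new JsonBlockAccumulator();
        // One object split across two reads, followed by a second object.
        System.out.println(acc.feed("{\"a\": 1, \"b\": {\"c\"".getBytes(StandardCharsets.UTF_8)));
        System.out.println(acc.feed(": 1}}{\"a\": 2}".getBytes(StandardCharsets.UTF_8)));
    }
}
```

The first call returns an empty list because no object is complete yet; the second call returns both objects. Working on raw bytes is safe here because the bytes of multi-byte UTF-8 characters never equal the ASCII codes for braces, quotes, or backslashes.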

Terry Ma