
Hello boys and girls.

I'm developing a terminal-based client application that communicates over TCP/IP with a server and sends and receives an arbitrary number of raw bytes. Each byte represents a command, which I need to parse into Java classes representing these commands for further use.

My question is how I should parse these bytes efficiently. I don't want to end up with a bunch of nested ifs and switch-cases.

I have the data classes for these commands ready to go. I just need to figure out the proper way of doing the parsing.

Here's some sample specifications:

The byte stream can look like this, for example (as integers): [1,24,2,65,26,18,3,0,239,19,0,14,0,42,65,110,110,97,32,109,121,121,106,228,42,15,20,5,149,45,87]

The first byte is 0x01, the start-of-header marker, which is only one byte long.

The second byte is the length, i.e. the number of bytes in certain commands; it is also only one byte.

Next comes any command: its first byte is the command code (0x02 in this case), followed by the n bytes that belong to the command.

And so on. At the end there are checksum-related bytes.
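To make the layout above concrete, here is a minimal sketch of reading one frame from a stream. The class name `FrameReaderSketch` and the method `readPayload` are made up for illustration, and it assumes the length byte counts exactly the command bytes that follow it; the checksum layout is not specified here, so the trailing bytes are left unread.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

class FrameReaderSketch {

    static final int START_OF_HEADER = 0x01;

    /**
     * Reads one frame and returns the bytes that follow the length byte.
     * Assumption: the length byte counts exactly the command bytes after it.
     */
    static byte[] readPayload(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);

        int start = in.readUnsignedByte();   // throws EOFException at end of stream
        if (start != START_OF_HEADER) {
            throw new IOException("Expected start of header 0x01 but got " + start);
        }

        int length = in.readUnsignedByte();  // 0..255, never negative
        byte[] payload = new byte[length];
        in.readFully(payload);               // keeps reading until the array is full
        return payload;
    }

    public static void main(String[] args) throws IOException {
        // The sample stream from the question.
        int[] sample = {1, 24, 2, 65, 26, 18, 3, 0, 239, 19, 0, 14, 0, 42, 65, 110,
                110, 97, 32, 109, 121, 121, 106, 228, 42, 15, 20, 5, 149, 45, 87};
        byte[] bytes = new byte[sample.length];
        for (int i = 0; i < sample.length; i++) {
            bytes[i] = (byte) sample[i];
        }
        byte[] payload = readPayload(new ByteArrayInputStream(bytes));
        System.out.println("payload bytes: " + payload.length);
    }
}
```

`readFully` is used instead of a plain `read` because a single `read` on a socket stream may return fewer bytes than requested.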

Sample class representing the set_cursor command:

/**
 * Sets the cursor position.
 * Syntax: 0x0E | position
 */
public class SET_CURSOR {

    private final int hexCommand = 0x0e;
    private final int position;

    public SET_CURSOR(int position) {
        // Store the position carried by the command.
        this.position = position;
    }

    public int getPosition() {
        return position;
    }

    public int getHexCommand() {
        return hexCommand;
    }

}
Henrique Barcelos
t-my

4 Answers


When parsing byte streams like this, the best design pattern to use is the Command pattern. Each Command acts as a handler that processes the next several bytes in the stream.

import java.io.InputStream;

interface Command {

    // Depending on your situation, either use an InputStream if you
    // don't know how many bytes each Command will use, or if a Command
    // uses so many bytes that copying everything would hurt performance...
    void execute(InputStream in);

    // ...or use an array if the number of bytes is known and small.
    void execute(byte[] data);

}

Then you can have a map containing each Command object for each of the byte "opcodes".

Map<Byte, Command> commands = ...

// Cast a hex literal; Byte.parseByte does not accept a "0x" prefix.
commands.put((byte) 0x0e, new SetCursorCommand());
...

Then you can parse the message and act on the Commands:

InputStream in = ... //our byte array as inputstream
byte header = (byte)in.read();
int length = in.read();
byte commandKey = (byte)in.read();   
byte[] data = new byte[length];
in.read(data);   // note: on a socket stream, read() may fill fewer than length bytes

Command command = commands.get(commandKey);
command.execute(data);

Can you have multiple Commands in the same byte message? If so you could then easily wrap the Command fetching and parsing in a loop until the EOF.
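If several commands can share one message, the lookup-and-dispatch loop could look like the sketch below. This is only an illustration: `CommandDispatchSketch`, `CommandParser` and `parseAll` are invented names, the input is assumed to be the command bytes with header, length and checksum already stripped, and only the `0x0E | position` command from the question is registered. Unlike the interface above, each handler here returns a parsed object rather than executing something.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CommandDispatchSketch {

    /** Marker type for parsed commands. */
    interface Command {}

    /** Reads the argument bytes of one command; the opcode is already consumed. */
    interface CommandParser {
        Command parse(ByteBuffer in);
    }

    /** Syntax from the question: 0x0E | position. */
    static final class SetCursor implements Command {
        final int position;

        SetCursor(int position) {
            this.position = position;
        }
    }

    private static final Map<Byte, CommandParser> PARSERS = new HashMap<>();

    static {
        // One entry per opcode; the other commands would be registered the same way.
        PARSERS.put((byte) 0x0E, in -> new SetCursor(in.get() & 0xFF));
    }

    /** Parses every command in the given bytes, in order. */
    static List<Command> parseAll(byte[] commandBytes) {
        ByteBuffer in = ByteBuffer.wrap(commandBytes);
        List<Command> result = new ArrayList<>();
        while (in.hasRemaining()) {
            byte opcode = in.get();
            CommandParser parser = PARSERS.get(opcode);
            if (parser == null) {
                throw new IllegalArgumentException(
                        String.format("Unknown opcode 0x%02X", opcode & 0xFF));
            }
            // A truncated command makes in.get() throw BufferUnderflowException.
            result.add(parser.parse(in));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Command> commands = parseAll(new byte[] {0x0E, 5, 0x0E, 12});
        System.out.println("parsed commands: " + commands.size());
    }
}
```

Because each parser consumes exactly its own bytes from the shared buffer, the loop needs no per-command length table and no switch statement.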

dkatzel

You can try the JBBP library for that: https://github.com/raydac/java-binary-block-parser

@Bin class Parsed { byte header; byte command; byte[] data; int checksum; }

Parsed parsed = JBBPParser
        .prepare("byte header; ubyte len; byte command; byte [len] data; int checksum;")
        .parse(theArray)
        .mapTo(Parsed.class);
Igor Maznitsa

This is a huge and complex subject.

It depends on the type of the data that you will read.

  • Is it a looooong stream?
  • Is it a lot of small independent structures/objects?
  • Do you have some references between structures/objects of your flow?

I recently wrote a byte serialization/deserialization library for a proprietary software.

I took a visitor-like approach with type conversion, the same way JAXB works.

I define my object as a Java class, initialize the parser on that class, and then pass it either the bytes to deserialize or the Java object to serialize.

The type detection (based on the first byte of your flow) is done up front with a simple case-matching mechanism (1 => ClassA, 15 => ClassF, ...).

EDIT: It may look complex or code-heavy (embedding objects), but keep in mind that nowadays Java optimizes this well, and it keeps the code clear and understandable.
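A rough sketch of that idea, not the closed-source library itself: the first byte selects a class, and a small reflective parser fills the class's fields from the following bytes. Everything named here (`ReflectiveParserSketch`, the `@Order` annotation, the `TYPES` table) is hypothetical, and the sketch assumes that every `int` field is one unsigned byte on the wire. The annotation is there because `getDeclaredFields()` does not guarantee any particular field order.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReflectiveParserSketch {

    /** Hypothetical annotation giving the wire order of a field. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Order {
        int value();
    }

    /** Plain data class for the 0x0E command; needs a no-argument constructor. */
    static class SetCursor {
        @Order(0) int position;
    }

    /** Type detection table: first byte of the command => class. */
    static final Map<Integer, Class<?>> TYPES = new HashMap<>();

    static {
        TYPES.put(0x0E, SetCursor.class);
    }

    static Object parse(ByteBuffer in) throws ReflectiveOperationException {
        int opcode = in.get() & 0xFF;
        Class<?> type = TYPES.get(opcode);
        if (type == null) {
            throw new IllegalArgumentException("Unknown opcode " + opcode);
        }
        Object target = type.getDeclaredConstructor().newInstance();

        // Collect the annotated fields and sort them into wire order.
        List<Field> fields = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isAnnotationPresent(Order.class)) {
                fields.add(field);
            }
        }
        fields.sort(Comparator.comparingInt((Field f) -> f.getAnnotation(Order.class).value()));

        for (Field field : fields) {
            field.setAccessible(true);
            if (field.getType() == int.class) {
                field.setInt(target, in.get() & 0xFF); // assumption: one unsigned byte
            } else {
                throw new UnsupportedOperationException("No type parser for " + field.getType());
            }
        }
        return target;
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        SetCursor cursor = (SetCursor) parse(ByteBuffer.wrap(new byte[] {0x0E, 7}));
        System.out.println("position = " + cursor.position);
    }
}
```

In a real implementation the sorted field list would be computed once per class and cached, since reflection on every message is comparatively expensive.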

superbob
  • The stream is quite short, from a few bytes to a couple of hundred at most. The protocol defines 30 different actions, so a rather small number. There are no links between these objects, so they should be pretty straightforward to parse. They can be in arbitrary order. This visitor-like approach sounds legitimate. Do you have any implementations where I could study this pattern? – t-my Sep 13 '13 at 12:34
  • I'm also a little concerned about the final performance, because this library (which I'm developing here) is used in a mobile context, on the Dalvik VM to be accurate. Is there simply too much overhead when I'm creating dozens of objects that consist only of simple primitive data? – t-my Sep 13 '13 at 12:37
  • Unfortunately I cannot publish the source code because it's closed source. But I can try to explain the basic principle of what I did: 1) Create a Java class for each of your command types. 2) Read your first byte, then read the second (size). 3) Read the whole buffer (one command) with the given size. 4) With a case, match your type with the corresponding class. 5) With a factory, initialize a parser for this particular class (use a cache if possible to ease reuse); this is the trickier part. 6) Deserialize! – superbob Sep 13 '13 at 12:48
  • The key part is the Java class that defines the commands. In my case it's a POJO with simple types. The parser factory does some reflection on the class to detect its properties and match each property type with a dedicated type parser (int/long/String). I use a visitor-like logic to enable type nesting (subclasses). – superbob Sep 13 '13 at 12:54

ByteBuffer can be used for parsing a byte stream (see "What is the use of ByteBuffer in Java?"):

byte[] bytesArray = {4, 2, 6, 5, 3, 2, 1};
ByteBuffer bb = ByteBuffer.wrap(bytesArray);
int intFromBB = bb.order(ByteOrder.LITTLE_ENDIAN).getInt(); 
byte byteFromBB = bb.get(); 
short shortFromBB = bb.getShort(); 
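One detail matters when applying this to the stream in the question: Java bytes are signed, so a value such as 239 comes back from `get()` as -17 unless it is masked. The sketch below shows that masking; the class and method names are made up, and the two-byte little-endian field is purely hypothetical, since the question does not say whether the protocol has multi-byte fields.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ByteBufferSketch {

    /** Returns {header, length, first command byte} as unsigned values (0..255). */
    static int[] readPrefix(byte[] frame) {
        ByteBuffer bb = ByteBuffer.wrap(frame);
        int header = bb.get() & 0xFF;   // mask to undo sign extension
        int length = bb.get() & 0xFF;
        int command = bb.get() & 0xFF;
        return new int[] {header, length, command};
    }

    /** Hypothetical multi-byte field: two bytes, least significant byte first. */
    static int readUnsignedShortLE(byte[] twoBytes) {
        return ByteBuffer.wrap(twoBytes)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getShort() & 0xFFFF;
    }

    public static void main(String[] args) {
        int[] prefix = readPrefix(new byte[] {1, 24, (byte) 239});
        System.out.println(prefix[0] + " " + prefix[1] + " " + prefix[2]);
    }
}
```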
Community
Justinas Jakavonis