Background: I'm writing code that serializes and deserializes a binary protocol over a network, with client implementations in C#, Swift and Java.
We get a packet off the network (a byte array), deserialize it into a tree of structures, and then traverse the tree looking for stuff.
In C#, I made use of the Span<T> struct, which gives you a "view" onto an existing memory buffer. Likewise in Swift, I used the Data struct, which can also act like a "view".
In C++ I'd probably use some similar concept.
Essentially something like:
raw buffer: R_T1aaa_T2bbb_T3ccc
parses as:  Root -> [Tag1[aaa], Tag2[bbb], Tag3[ccc]]
In the above model, the Root, Tag1, Tag2 and Tag3 objects are structs allocated on the stack. The tag contents (aaa, bbb, ccc) are just pointers into the underlying raw buffer, and we haven't allocated or copied any heap memory for them at all.
In Java, I had begun by creating a class like this (@MarkRotteveel kindly points out that I could use ByteBuffer rather than making my own wrapper class):
class ByteArrayView {
    @NonNull final byte[] owner; // shared backing array, never copied
    final int offset;
    final int length;
    ByteArrayView(@NonNull byte[] owner, int offset, int length) {
        this.owner = owner;
        this.offset = offset;
        this.length = length;
    }
}
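For reference, here's roughly what the ByteBuffer-based version of the same idea looks like - a minimal sketch against the example buffer above, with the tag offsets hard-coded (treating each character of the example as one byte) rather than produced by a real parser:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ByteBufferViewSketch {
    public static void main(String[] args) {
        // The raw packet off the network; this array is shared, never copied.
        byte[] packet = "R_T1aaa_T2bbb_T3ccc".getBytes(StandardCharsets.US_ASCII);

        // Each "view" is just a ByteBuffer window over the same backing array.
        // Offsets/lengths are hand-picked to match the example layout above.
        ByteBuffer tag1 = ByteBuffer.wrap(packet, 4, 3).slice();   // aaa
        ByteBuffer tag2 = ByteBuffer.wrap(packet, 10, 3).slice();  // bbb
        ByteBuffer tag3 = ByteBuffer.wrap(packet, 16, 3).slice();  // ccc

        // Reads go through the view straight into the shared array.
        System.out.println((char) tag2.get(0)); // prints 'b'
    }
}

Each ByteBuffer is, of course, still a small heap object wrapping the same array, which leads to my concern below.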
I was going to proceed with the same pattern as I used in C# and Swift, but then the thought occurred to me: I'm using array views to avoid heap allocation, yet everything in Java is a heap allocation. Instead of copying some byte[] values around the place, I allocate ByteArrayView objects, but they're heap allocations nonetheless.
I'm aware that if I had large "slices" then something like my ByteArrayView or java.nio.ByteBuffer would reduce the total amount of memory my program allocates, but for the most part my tags are small - e.g. 4 bytes, 12 bytes, 32 bytes.
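To make the trade-off concrete, these are the two shapes of code I'm choosing between (a rough sketch reusing the ByteArrayView class above; the method names are just illustrative):

import java.util.Arrays;

class TradeOffSketch {
    // Option A: copy the 4-32 payload bytes into their own small array.
    static byte[] copyTag(byte[] packet, int offset, int length) {
        // one small heap allocation plus a copy of the payload bytes
        return Arrays.copyOfRange(packet, offset, offset + length);
    }

    // Option B: wrap the shared packet in a view (ByteArrayView or ByteBuffer).
    static ByteArrayView viewTag(byte[] packet, int offset, int length) {
        // also one small heap allocation, but no copy of the payload bytes
        return new ByteArrayView(packet, offset, length);
    }
}

Either way I end up with one small object per tag; the difference is only whether the payload bytes are duplicated or shared.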
Does trading small byte[] objects for wrapper objects have any meaningful benefit if they're all still on the heap? Does it actually make things worse, because there's an additional indirection before we get to the actual bytes we want to read? Is it worth the extra code complexity?
What would experienced Java developers do here?