
Background

A bit input stream is backed by an array of bytes. A handful of methods read from that byte array into arrays of various primitive types.

Problem

The read methods duplicate one another almost line for line. Java lacks generics over primitive types, so perhaps the repetition is unavoidable.

Code

The repetitious code is apparent in the following methods:

@Override
public long readBytes(final byte[] out, final int offset, final int count, final int bits) {
    final int total = offset + count;

    assert out != null;
    assert total <= out.length;

    final long startPosition = position();

    for (int i = offset; i < total; i++) {
        out[i] = readByte(bits);
    }

    return position() - startPosition;
}

@Override
public long readShorts(final short[] out, final int offset, final int count, final int bits) {
    final int total = offset + count;

    assert out != null;
    assert total <= out.length;

    final long startPosition = position();

    for (int i = offset; i < total; i++) {
        out[i] = readShort(bits);
    }

    return position() - startPosition;
}

Note how final byte[] out relates to readByte(bits) just as final short[] out relates to readShort(bits). These relations are the crux of the problem.

Question

How can the duplication be eliminated, if at all, without incurring a significant performance hit (e.g., by autoboxing)?

Dave Jarvis
  • Nope, nothing you can do there. Duplication is the only option. – Andy Turner Feb 27 '20 at 19:19
  • Use a third party primitive collection – Vince Feb 27 '20 at 19:20
  • `Java lacks generics on primitive types, so perhaps the repetition is unavoidable.` Yup. (Usually it isn't much of a problem, since it's rare for one program to need more than a few different primitives. You could also "fix" this by putting primitives inside a class and using object serialization, although that can be relatively slow.) – markspace Feb 27 '20 at 19:22
  • @markspace and even if you need all primitives, there are only 8 of them. Code generation may be an option. – Andy Turner Feb 27 '20 at 19:23
  • Also, (just remembered this) if you're reading bulk primitives like your code seems to indicate, using `ByteBuffer` methods like `asDoubleBuffer()` or `asShortBuffer()` will offload some of the lowest level work. https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/nio/ByteBuffer.html – markspace Feb 27 '20 at 19:25
  • @Andy Turner And that is exactly what happened with `String.valueOf`: the overloaded method was simply rewritten 9 times for different cases, just as an example (I know not every implementation is for a primitive). – Nexevis Feb 27 '20 at 19:35
  • Note that there are some efforts to bring primitive generic support to Java, i.e. `List<int>` etc. Release in maybe 2-5 years or so. It is called Project Valhalla. – Zabuzard Mar 04 '20 at 20:29
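
The code-generation suggestion in the comments could look roughly like the sketch below. It is only an illustration: `ReadMethodGenerator` and `generate` are made-up names, and it assumes single-value readers such as `readInt(bits)` and `readLong(bits)` exist alongside the `readByte(bits)` and `readShort(bits)` shown in the question.

```java
public class ReadMethodGenerator {
    // Method template: %1$s is the primitive type, %2$s its capitalized name.
    private static final String TEMPLATE = String.join("\n",
            "@Override",
            "public long read%2$ss(final %1$s[] out, final int offset, final int count, final int bits) {",
            "    final int total = offset + count;",
            "",
            "    assert out != null;",
            "    assert total <= out.length;",
            "",
            "    final long startPosition = position();",
            "",
            "    for (int i = offset; i < total; i++) {",
            "        out[i] = read%2$s(bits);",
            "    }",
            "",
            "    return position() - startPosition;",
            "}",
            "");

    // Returns the source text of one bulk read method for the given primitive type.
    public static String generate(final String primitive) {
        final String capitalized =
                Character.toUpperCase(primitive.charAt(0)) + primitive.substring(1);
        return String.format(TEMPLATE, primitive, capitalized);
    }

    public static void main(final String[] args) {
        // Assumed set of types; extend the list as needed.
        for (final String primitive : new String[] {"byte", "short", "int", "long"}) {
            System.out.println(generate(primitive));
        }
    }
}
```

The duplication still exists in the generated source, but only the template is maintained by hand, and the generated methods avoid boxing.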

2 Answers

If you're reading bulk primitives like your code seems to indicate, using ByteBuffer methods like asDoubleBuffer() or asShortBuffer() will offload some of the lowest level work.

Example:

   import java.nio.ByteBuffer;
   import java.nio.ShortBuffer;

   public void readBytes( final byte[] out, final int offset, final int count, final ByteBuffer buffer ) {
      buffer.get( out, offset, count );  // updates the ByteBuffer `position` automatically
   }

   public void readShorts( final short[] out, final int offset, final int count, final ByteBuffer buffer ) {
      ShortBuffer sb = buffer.asShortBuffer();
      sb.get( out, offset, count );  // note that each `short` consumes two bytes
      // The view keeps its own position, so advance the underlying buffer by hand.
      buffer.position( buffer.position() + count * Short.BYTES );
   }

(Code compiles but not tested!)
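
One detail worth checking when using view buffers: a `ShortBuffer` obtained from `asShortBuffer()` tracks its own position, independent of the source `ByteBuffer`. The small program below is a sketch for observing that (the class and method names are made up for the demo). Note also that these bulk reads work on whole, byte-aligned values, so they do not cover the `bits` parameter from the question.

```java
import java.nio.ByteBuffer;
import java.nio.ShortBuffer;

public class ShortViewDemo {
    // Reads out.length shorts through a view and returns the SOURCE buffer's
    // position immediately afterwards.
    public static int positionAfterViewRead(final ByteBuffer buffer, final short[] out) {
        final ShortBuffer view = buffer.asShortBuffer();
        view.get(out, 0, out.length);
        return buffer.position();
    }

    public static void main(final String[] args) {
        // ByteBuffer is big-endian by default, so {0, 1} decodes to the short 1.
        final ByteBuffer buffer = ByteBuffer.wrap(new byte[] {0, 1, 0, 2, 0, 3});
        final short[] out = new short[2];

        final int before = buffer.position();
        final int after = positionAfterViewRead(buffer, out);

        System.out.println(out[0] + ", " + out[1]);  // 1, 2
        System.out.println(before + " -> " + after); // 0 -> 0 (source not advanced)

        // Advance the source buffer manually past the bytes that were consumed.
        buffer.position(after + out.length * Short.BYTES);
        System.out.println(buffer.position());       // 4
    }
}
```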

markspace

One possibility, which will incur a performance penalty, is to pass the array as an Object and write its elements through java.lang.reflect.Array, so that a single loop serves all of the read methods.

import java.lang.reflect.Array;

// Reads a single value that is `bits` wide.
@FunctionalInterface
public interface BitArrayReader {
    Object read(int bits);
}

private long readPrimitive(
        final Object out, final int offset, final int count, final int bits,
        final BitArrayReader reader) {
    final int total = offset + count;

    assert out != null;
    assert total <= Array.getLength(out);

    final long startPosition = position();

    for (int i = offset; i < total; i++) {
        Array.set(out, i, reader.read(bits));
    }

    return position() - startPosition;
}

@Override
public long readBooleans(boolean[] out, int offset, int count, int bits) {
    return readPrimitive(out, offset, count, bits, this::readBoolean);
}

This removes the duplication at the cost of some performance (each element is boxed and then stored reflectively) and a minor loss of compile-time type safety.
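
To make the type-safety trade-off concrete, here is a small standalone sketch (the `ReflectiveFillDemo` class and its `fill` method are invented for illustration and are not part of the stream class). `Array.set` unboxes the value to the array's component type, and a mismatch only shows up at run time as an `IllegalArgumentException`.

```java
import java.lang.reflect.Array;
import java.util.Arrays;

public class ReflectiveFillDemo {
    // Stores value into out[offset .. offset + count) through reflection.
    public static void fill(final Object out, final int offset, final int count, final Object value) {
        final int total = offset + count;

        for (int i = offset; i < total; i++) {
            Array.set(out, i, value); // unboxes value to the component type
        }
    }

    public static void main(final String[] args) {
        final short[] shorts = new short[3];

        fill(shorts, 1, 2, (short) 7); // boxed as Short, unboxed to short
        System.out.println(Arrays.toString(shorts)); // [0, 7, 7]

        try {
            fill(shorts, 0, 1, 7); // boxed as Integer, which cannot narrow to short
        } catch (final IllegalArgumentException e) {
            System.out.println("rejected at run time");
        }
    }
}
```

Both calls compile, which is the loss of compile-time checking mentioned above.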

Dave Jarvis