61

Java 8 provides Stream<T> specializations for double, int and long: DoubleStream, IntStream and LongStream respectively. However, I could not find an equivalent for byte in the documentation.

Does Java 8 provide a ByteStream class?

Tunaki
sdgfsdh
  • 3
    http://stackoverflow.com/questions/22918847/why-are-new-java-util-arrays-methods-in-java-8-not-overloaded-for-all-the-primit – assylias Sep 08 '15 at 14:00
  • Does this answer your question? [Why are new java.util.Arrays methods in Java 8 not overloaded for all the primitive types?](https://stackoverflow.com/questions/22918847/why-are-new-java-util-arrays-methods-in-java-8-not-overloaded-for-all-the-primit) – Andreas is moving to Codidact Jun 18 '20 at 00:07

5 Answers

52

No, it does not exist. It was deliberately left out so as not to clutter the Stream API with a separate set of classes for every primitive type.

Quoting a mail from Brian Goetz on the OpenJDK mailing list:

Short answer: no.

It is not worth another 100K+ of JDK footprint each for these forms which are used almost never. And if we added those, someone would demand short, float, or boolean.

Put another way, if people insisted we had all the primitive specializations, we would have no primitive specializations. Which would be worse than the status quo.

danorton
Tunaki
  • 105
    Seriously? Byte streams are used "almost never"? I wonder what planet that guy is living on, because in the real world streams of bytes are ubiquitous. – augurar Jul 25 '17 at 17:46
  • 6
    @augurar You'd have to ask [that guy](https://stackoverflow.com/users/3553087/brian-goetz) to know for sure :-) My impression is that the kind of byte streams most devs are familiar with are more along the lines of `ByteArrayInputStream` / `ByteArrayOutputStream` (used for I/O operations, bulk data processing, etc.). These objects are conceptually quite different from the `Stream`s of the Java 8 Stream API, which are used in functional programming. – GOTO 0 Apr 06 '18 at 19:39
  • 18
    I'm with @augurar. There is `Arrays.stream(int[] array)`, `Arrays.stream(long[] array)` and `Arrays.stream(double[] array)` but not `Arrays.stream(byte[] array)` or the other primitive types. Actually, I find it rather ridiculous. – The Coordinator Nov 23 '19 at 01:55
  • 8
    Ah yes, it's nice to see the thing I wanted was not implemented because they just didn't feel like it. – Andrew T Finnell Apr 29 '20 at 02:48
  • 2
    Everyone - 1) You can implement it yourself. 2) You can find a 3rd-party implementation. 2a) If you can't find a 3rd-party implementation, that implies something about the degree to which `ByteStream` is actually needed. – Stephen C Oct 03 '20 at 05:49
  • 1
    Sometimes I'm very much... mmm... surprised with logic of these guys. Really. – Lev Sivashov Nov 29 '22 at 17:43
49

Most byte-related operations are automatically promoted to int. For example, consider a simple method that adds a byte constant to each element of a byte[] array and returns a new byte[] array (a potential candidate for ByteStream):

public static byte[] add(byte[] arr, byte addend) {
    byte[] result = new byte[arr.length];
    int i=0;
    for(byte b : arr) {
        result[i++] = (byte) (b+addend);
    }
    return result;
}

Even though we add two byte variables, they are widened to int, and you need to cast the result back to byte. In Java bytecode, most byte-related operations (except array load/store and the cast to byte) are expressed with 32-bit integer instructions (iadd, ixor, if_icmple and so on). So in practice it is fine to process bytes as ints with IntStream. We just need two additional operations:

  • Create an IntStream from byte[] array (widening bytes to ints)
  • Collect an IntStream to byte[] array (using (byte) cast)

The first one is really easy and can be implemented like this:

public static IntStream intStream(byte[] array) {
    return IntStream.range(0, array.length).map(idx -> array[idx]);
}

So you can add such a static method to your project and be happy.
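One detail worth spelling out: because byte is signed in Java, widening `array[idx]` to int sign-extends it, so the stream carries values from -128 to 127. If the data should be treated as unsigned (0 to 255), masking with `0xFF` is the usual approach. The sketch below shows both views side by side; the class name `UnsignedBytesSketch` and the helper `unsignedIntStream` are hypothetical names introduced only for illustration.

```java
import java.util.stream.IntStream;

final class UnsignedBytesSketch {
    // Signed view: each byte is sign-extended, so values range from -128 to 127.
    static IntStream intStream(byte[] array) {
        return IntStream.range(0, array.length).map(idx -> array[idx]);
    }

    // Unsigned view (hypothetical helper): masking with 0xFF maps each byte to 0..255.
    static IntStream unsignedIntStream(byte[] array) {
        return IntStream.range(0, array.length).map(idx -> array[idx] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xFF, 1};
        System.out.println(intStream(data).sum());         // -1 + 1 = 0
        System.out.println(unsignedIntStream(data).sum()); // 255 + 1 = 256
    }
}
```

Which view is appropriate depends on what the bytes represent; for arithmetic that is cast back to byte afterwards, both give the same low 8 bits.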

Collecting the stream into a byte[] array is trickier. Using standard JDK classes, the simplest solution is ByteArrayOutputStream:

public static byte[] toByteArray(IntStream stream) {
    return stream.collect(ByteArrayOutputStream::new, (baos, i) -> baos.write((byte) i),
            (baos1, baos2) -> baos1.write(baos2.toByteArray(), 0, baos2.size()))
            .toByteArray();
}

However, it has unnecessary overhead due to synchronization. It would also be nice to handle streams of known length specially, to reduce allocation and copying. Nevertheless, you can now use the Stream API for byte[] arrays:

public static byte[] addStream(byte[] arr, byte addend) {
    return toByteArray(intStream(arr).map(b -> b+addend));
}

My StreamEx library has both of these operations in the IntStreamEx class, which enhances the standard IntStream, so you can use it like this:

public static byte[] addStreamEx(byte[] arr, byte addend) {
    return IntStreamEx.of(arr).map(b -> b+addend).toByteArray();
}

Internally, the toByteArray() method uses a simple resizable byte buffer and specially handles the case when the stream is sequential and the target size is known in advance.
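To illustrate the idea of an unsynchronized resizable buffer, here is a minimal sketch using only the JDK. It is not StreamEx's actual implementation: `ByteBufferCollectorSketch`, `GrowableBytes` and `mapBytes` are hypothetical names, and the growth policy (doubling from 16 bytes) is an arbitrary assumption that ignores very large arrays. `mapBytes` shows the known-length case, where the result can be preallocated and no copying is needed.

```java
import java.util.Arrays;
import java.util.function.IntUnaryOperator;
import java.util.stream.IntStream;

final class ByteBufferCollectorSketch {
    // Hypothetical growable buffer; unlike ByteArrayOutputStream, nothing here is synchronized.
    static final class GrowableBytes {
        private byte[] data = new byte[16];
        private int size;

        void add(int value) {
            if (size == data.length) {
                data = Arrays.copyOf(data, data.length * 2);
            }
            data[size++] = (byte) value; // narrowing keeps the low 8 bits
        }

        void addAll(GrowableBytes other) {
            if (size + other.size > data.length) {
                data = Arrays.copyOf(data, Math.max(data.length * 2, size + other.size));
            }
            System.arraycopy(other.data, 0, data, size, other.size);
            size += other.size;
        }

        byte[] toArray() {
            return Arrays.copyOf(data, size);
        }
    }

    // General case: length unknown, so collect into the growable buffer.
    static byte[] toByteArray(IntStream stream) {
        return stream.collect(GrowableBytes::new, GrowableBytes::add, GrowableBytes::addAll)
                     .toArray();
    }

    // Known-length case: write each mapped value straight into a preallocated array.
    static byte[] mapBytes(byte[] source, IntUnaryOperator op) {
        byte[] result = new byte[source.length];
        IntStream.range(0, source.length)
                 .forEach(i -> result[i] = (byte) op.applyAsInt(source[i]));
        return result;
    }

    public static void main(String[] args) {
        byte[] shifted = toByteArray(IntStream.of(1, 2, 3).map(b -> b + 10));
        System.out.println(Arrays.toString(shifted)); // [11, 12, 13]
    }
}
```

Each thread in a parallel `collect` gets its own `GrowableBytes` from the supplier, which is why no locking is needed inside the buffer itself.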

Tagir Valeev
  • 3
    `baos1.write(baos2.toByteArray(), 0, baos2.size())` is an unnecessarily complicated merger. First, `toByteArray()` always returns an appropriately sized array, so `, 0, baos2.size()` is not needed. The reason the array is always appropriately sized is that it always returns a newly allocated array. If you want to avoid this overhead, consider using `baos2.writeTo(baos1)` instead; that’s shorter *and* more efficient. – Holger Oct 25 '16 at 13:32
  • 1
    By the way, the cast from `int` to `byte` is unnecessary when writing a single `byte` to an `OutputStream`, hence `ByteArrayOutputStream::write` is sufficient as accumulator function. – Holger Oct 25 '16 at 13:37
  • @Holger, both `writeTo` and `write(byte[])` are declared to throw an `IOException`, so you would need an explicit try-catch. I just selected the shortest version (`write(byte[], int, int)` does not throw - crazy, I know). `writeTo` would be more efficient indeed. As for the explicit cast, I don't remember. Probably I decided that such a version would be clearer. – Tagir Valeev Oct 28 '16 at 04:04
  • 2
    Granted, `writeTo` requires a `try…catch` around it, so `{try{baos2.writeTo(baos1);}catch(IOException x){} }` is not shorter than `baos1.write(baos2.toByteArray(), 0, baos2.size())`, but it’s not significantly larger (but more efficient). `writeTo` had to declare `IOException` as you can pass an arbitrary `OutputStream` as argument. The `write(byte[])` method has not been overridden, so unfortunately, it has the general `OutputStream.write(byte[])` signature. Reminds me of [this issue](http://stackoverflow.com/q/39648062/2711488)… – Holger Oct 28 '16 at 10:40
  • 1
    Quite a space requirement to store 8 bits each in a 32-bit location, isn't it? – Kaplan Apr 04 '20 at 07:47
  • 2
    @Kaplan A stream is not a storage structure. It’s a tool for *processing* data and, as this answer already says, “most of the byte-related operations are automatically promoted to int” in Java anyway. Which doesn’t hurt considering that today’s CPUs have 64 bit wide data registers anyway. The storage still is a `byte[]` here. – Holger Aug 16 '21 at 15:17
4

I like this solution because it streams directly from a byte[] at runtime, rather than building a collection first and then streaming from that collection. I believe it feeds the stream one byte at a time.

byte[] bytes = _io.readAllBytes(file);
AtomicInteger ai = new AtomicInteger(0);

Stream.generate(() -> bytes[ai.getAndIncrement()]).limit(bytes.length);

However, this is quite slow because of the synchronization bottleneck of the AtomicInteger, so back to imperative loops!
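One way to keep a stream while dropping the shared counter is to drive it by index instead. The following is a sketch under that assumption, not a measured comparison; `IndexedByteStreamSketch` is a hypothetical class name. It produces the same `Stream<Byte>` as the snippet above, but each element is looked up by its own index, so there is no AtomicInteger and the encounter order is defined by the index range.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

final class IndexedByteStreamSketch {
    // Boxes each byte on demand; no shared mutable state is involved.
    static Stream<Byte> stream(byte[] bytes) {
        return IntStream.range(0, bytes.length).mapToObj(i -> bytes[i]);
    }

    public static void main(String[] args) {
        byte[] bytes = {10, 20, 30};
        List<Byte> copy = stream(bytes).collect(Collectors.toList());
        System.out.println(copy); // [10, 20, 30]
    }
}
```

Boxing every byte still has a cost, so whether this beats a plain loop would have to be measured for the workload at hand.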

David Buck
Adligo
  • I would suggest always measuring the performance before reaching such conclusions (not saying you didn't). Such usages of atomics are often surprisingly fast especially if no actual runtime contention events actually occur. – Chris Mountford Jun 29 '23 at 12:12
3

Use com.google.common.primitives.Bytes.asList(byte[]).stream() instead.

sdgfsdh
Inshua
2

If you don't have a ByteStream, build one:

Stream.Builder<Byte> builder = Stream.builder();
for( int i = 0; i < array.length; i++ )
  builder.add( array[i] );
Stream<Byte> stream = builder.build();

...where array can be of type byte[] or Byte[]
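For the `Byte[]` case specifically, the builder loop is optional, since the generic `Arrays.stream(T[])` overload already accepts an array of boxed bytes. A minimal sketch, with `BoxedByteArraySketch` as a hypothetical class name:

```java
import java.util.Arrays;
import java.util.stream.Stream;

final class BoxedByteArraySketch {
    // Byte[] is an ordinary object array, so the generic overload applies.
    static Stream<Byte> stream(Byte[] array) {
        return Arrays.stream(array);
    }

    // Unboxes to an IntStream when arithmetic is needed.
    static int sum(Byte[] array) {
        return stream(array).mapToInt(Byte::intValue).sum();
    }

    public static void main(String[] args) {
        Byte[] boxed = {(byte) 1, (byte) 2, (byte) 3};
        System.out.println(sum(boxed)); // 6
    }
}
```

This shortcut does not apply to a primitive `byte[]`, for which there is no `Arrays.stream` overload.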

Stefan Steinegger
Kaplan