
Building upon "Populating a List with a contiguous range of shorts", I tried generating an array of primitive shorts. This turned out to be harder than expected.

This works:

Short[] range = IntStream.range(0, 500).mapToObj(value -> (short) value).toArray(Short[]::new);

but this generates a compiler error:

short[] range = IntStream.range(0, 500).mapToObj(value -> (short) value).toArray(short[]::new);

method toArray in interface Stream<T> cannot be applied to given types;
  required: IntFunction<A[]>
  found: short[]::new
  reason: inference variable A has incompatible bounds
    equality constraints: short
    upper bounds: Object
  where A,T are type-variables:
    A extends Object declared in method <A>toArray(IntFunction<A[]>)
    T extends Object declared in interface Stream

This seems to be an intersection of two problems:

  1. The primitive Stream APIs do not provide an implementation for shorts.
  2. The non-primitive Stream APIs do not seem to provide a mechanism to return a primitive array.

Any ideas?

Gili
  • 1
    You're making your own life complicated by using short instead of int. That won't bring any performance or memory advantage, but make your code harder to write. I've not used a short, or seen a short in any API, for years. – JB Nizet Jun 11 '15 at 20:30
  • @JBNizet I get that, but the field corresponds to a database column of type `short` (I cannot change the database schema for legacy reasons). By storing `short` up-front I prevent the possibility of a casting error at the last minute, before sending the data to the database. – Gili Jun 11 '15 at 20:33
  • 2
    You correctly understand the problem. The streams library designers decided it wasn't worth adding ShortStream etc., unboxing does not interoperate with generics, and adding special `toShortArray()` methods would also not work because you could call them on, say, a `Stream`. – Jeffrey Bosboom Jun 11 '15 at 20:34
  • 6
    If you really want an array of shorts, I would simply use a for loop rather than trying to use streams. – JB Nizet Jun 11 '15 at 20:37
  • @JBNizet can you elaborate on where there is no performance or memory advantage to using a short for something? It's literally half the number of bytes of integers. We're working with efficient data storage here for an embedded system, so a 50% reduction in data size is significant. – GuyPaddock Apr 18 '20 at 15:35
  • 1
    @GuyPaddock For performance reasons, the JVM could be memory-aligning shorts to 32-bit or 64-bit boundaries. See https://stackoverflow.com/a/1496881/14731 and a counter-argument at https://lemire.me/blog/2012/05/31/data-alignment-for-speed-myth-or-reality/. Point is: we don't know what is going on under the hood. If I were you, I'd find a way to measure actual memory usage for your particular setup (hardware/software) and if it makes a difference then by all means go for it. – Gili Apr 19 '20 at 16:10
  • @JBNizet I disagree. I worked on a streaming application where we beg for saving a single byte, that's merely due to the amount of data we have to ingest and process every second! Hence, I had to use `short` a few times! – Yahya Aug 29 '22 at 14:41

2 Answers

2

You may consider using my StreamEx library. It extends the standard streams with additional methods. One of the goals of the library is better interoperation with old code. In particular, it has IntStreamEx.toShortArray() and IntStreamEx.of(short...):

short[] numbers = IntStreamEx.range(500).toShortArray();
short[] evenNumbers = IntStreamEx.of(numbers).map(x -> x*2).toShortArray();

Note that this is still a stream of int values. When toShortArray() is called, each value is converted to short with a (short) cast, so overflow is possible. Use with care.
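To make the overflow caveat concrete, here is a minimal plain-Java sketch that does not depend on StreamEx. The class `ShortNarrowing` and its method `toShortExact` are hypothetical helpers introduced only for illustration; they are not part of StreamEx or the JDK. The sketch shows how a `(short)` cast wraps silently, and one way to fail fast instead.

```java
import java.util.stream.IntStream;

final class ShortNarrowing {
    /** Narrows an int to short, throwing instead of wrapping silently (hypothetical helper). */
    static short toShortExact(int value) {
        if (value < Short.MIN_VALUE || value > Short.MAX_VALUE) {
            throw new ArithmeticException("short overflow: " + value);
        }
        return (short) value;
    }

    public static void main(String[] args) {
        // A plain cast keeps only the low 16 bits.
        System.out.println((short) 40_000); // prints -25536

        // Used as a map step, the check runs on every element of an int stream.
        int[] checked = IntStream.range(0, 500)
                .map(ShortNarrowing::toShortExact)
                .toArray();
        System.out.println(checked.length); // prints 500

        try {
            toShortExact(40_000);
        } catch (ArithmeticException e) {
            System.out.println(e.getMessage()); // prints short overflow: 40000
        }
    }
}
```

Since the stream is still an int stream, a check like this could presumably be placed in a map step before the final conversion, at the cost of one extra range comparison per element.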

There are also IntStreamEx.toByteArray(), IntStreamEx.toCharArray(), and DoubleStreamEx.toFloatArray().

Tagir Valeev
1

The canonical way would be implementing a custom Collector.

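import java.util.Arrays;
import java.util.stream.Collector;

class ShortCollector {
    // Accumulates boxed Integers into a growable short[] buffer.
    public static final Collector<Integer,ShortCollector,short[]> TO_ARRAY
        =Collector.of(ShortCollector::new, ShortCollector::add,
                      ShortCollector::merge, c->c.get());

    short[] array=new short[100];
    int pos;

    public void add(int value) {
        int ix=pos;
        // double the buffer when it is full
        if(ix==array.length) array=Arrays.copyOf(array, ix*2);
        array[ix]=(short)value; // narrowing cast, may overflow
        pos=ix+1;
    }
    public ShortCollector merge(ShortCollector c) {
        int ix=pos, cIx=c.pos, newSize=ix+cIx;
        if(array.length<newSize) array=Arrays.copyOf(array, newSize);
        System.arraycopy(c.array, 0, array, ix, cIx);
        pos=newSize; // count the merged elements (matters for parallel streams)
        return this;
    }
    public short[] get() {
        // trim the buffer to the number of elements actually stored
        return pos==array.length? array: Arrays.copyOf(array, pos);
    }
}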

Then you could use it like

short[] array=IntStream.range(0, 500).boxed().collect(ShortCollector.TO_ARRAY);

The drawback is that Collectors only work for reference types (generics do not support primitive types), so you have to resort to boxed(). Collectors also cannot use information about the number of elements, even when it is available. The performance is therefore likely to be far worse than toArray() on a primitive data stream.

So a solution striving for higher performance (I limit this to the single-threaded case) would look like this:

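// Needs java.util.Arrays, java.util.Spliterator, java.util.function.IntConsumer, java.util.stream.IntStream
public static short[] toShortArray(IntStream is) {
    Spliterator.OfInt sp = is.spliterator();
    long l=sp.getExactSizeIfKnown();
    // Size known up front: fill an exactly sized array without intermediate copies.
    if(l>=0) {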
        if(l>Integer.MAX_VALUE) throw new OutOfMemoryError();
        short[] array=new short[(int)l];
        sp.forEachRemaining(new IntConsumer() {
            int ix;
            public void accept(int value) {
                array[ix++]=(short)value;
            }
        });
        return array;
    }
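    // Size unknown: collect into chunks of doubling size (64, 128, ...) and join them at the end.
    final class ShortCollector implements IntConsumer {
        int bufIx, currIx, total;
        short[][] buffer=new short[25][]; // 25 doubling chunks hold close to 2^31 elements
        short[] current=buffer[0]=new short[64];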

        public void accept(int value) {
            int ix = currIx;
            if(ix==current.length) {
                current=buffer[++bufIx]=new short[ix*2];
                total+=ix;
                ix=0;
            }
            current[ix]=(short)value;
            currIx=ix+1;
        }
        short[] toArray() {
            if(bufIx==0)
                return currIx==current.length? current: Arrays.copyOf(current, currIx);
            int p=0;
            short[][] buf=buffer;
            short[] result=new short[total+currIx];
            for(int bIx=0, e=bufIx, l=buf[0].length; bIx<e; bIx++, p+=l, l+=l)
                System.arraycopy(buf[bIx], 0, result, p, l);
            System.arraycopy(current, 0, result, p, currIx);
            return result;
        }
    }
    ShortCollector c=new ShortCollector();
    sp.forEachRemaining(c);
    return c.toArray();
}

You may use it like

short[] array=toShortArray(IntStream.range(0, 500));
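For comparison, when the bounds of the range are known up front, the plain loop suggested in the comments avoids streams entirely. The following is only a sketch of that idea; the class `ShortRange` and its `range` method are hypothetical names, and the bounds check is an added assumption about what a caller would want.

```java
final class ShortRange {
    /**
     * Returns startInclusive, startInclusive + 1, ..., endExclusive - 1 as a short[].
     * Hypothetical helper: rejects ranges that are descending or do not fit in a short.
     */
    static short[] range(int startInclusive, int endExclusive) {
        if (startInclusive < Short.MIN_VALUE
                || endExclusive > Short.MAX_VALUE + 1
                || startInclusive > endExclusive) {
            throw new IllegalArgumentException("range does not fit in short");
        }
        short[] result = new short[endExclusive - startInclusive];
        for (int i = 0; i < result.length; i++) {
            result[i] = (short) (startInclusive + i); // safe: bounds were checked above
        }
        return result;
    }

    public static void main(String[] args) {
        short[] array = range(0, 500);
        System.out.println(array.length); // prints 500
    }
}
```

This allocates the result exactly once and never boxes, but it only covers contiguous ranges, whereas toShortArray(IntStream) accepts any int stream.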
Holger