16

I have a stream of objects like this:

"0", "1", "2", "3", "4", "5",

How can I transform it into a stream of pairs:

{ new Pair("0", "1"), new Pair("2", "3"), new Pair("4", "5")}.

The stream size is unknown. I am reading data from a file that might be big. I only have an iterator over the collection, and I turn this iterator into a stream using a spliterator. I know there is an answer for processing adjacent pairs with StreamEx ("Collect successive pairs from a stream"). Can this be done in plain Java or with StreamEx? Thanks.

niemar
  • what happens when the length is odd? what should be the Pair? – Vinay Prajapati Feb 28 '18 at 11:10
  • The length is always even. The Pair is a simple pojo that contains two objects (strings). – niemar Feb 28 '18 at 11:13
  • So @niemer length of your input stream is always going to be even. My question is what happens when it's odd? how would you like to manage it inside code? – Vinay Prajapati Feb 28 '18 at 11:17
  • Let's assume that odd number of elements is not a problem. – niemar Feb 28 '18 at 11:19
  • @niemar Vinay's remark is on point : using the `Stream` API (which allows lazily populated & size-limited operations) is kinda incompatible with the notion "the stream will always have an even number of elements". What will happen when the chain streams the first element ? the third ? etc ? – Jeremy Grand Feb 28 '18 at 12:41

5 Answers

6

It's not a natural fit but you can do

List input = ...
List<Pair> pairs = IntStream.range(0, input.size() / 2)
                            .map(i -> i * 2)
                            .mapToObj(i -> new Pair(input.get(i), input.get(i + 1)))
                            .collect(Collectors.toList());

To create Pairs as you go in a stream, you need a stateful lambda, which should generally be avoided but can be done. Note: this only works if the stream is single-threaded, i.e. not parallel.

Stream<?> stream = ...; // any sequential stream
assert !stream.isParallel();
Object[] last = { null };
List<Pair> pairs = stream.map(a -> {
        if (last[0] == null) {
            last[0] = a;
            return null;
        } else {
            Object t = last[0];
            last[0] = null;
            return new Pair(t, a);
        }
     }).filter(p -> p != null)
       .collect(Collectors.toList());
assert last[0] == null; // to check for an even number input.
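
One caveat of using null as the "nothing pending" marker, as the code above does: a null element in the stream is indistinguishable from an empty slot, so the pairing silently goes wrong. A minimal sketch of the same stateful idea that tolerates null elements, assuming the stream is processed sequentially (the class and method names are made up for illustration, and pairs are represented as two-element lists rather than a Pair class):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlagPairing {

    /**
     * Pairs consecutive elements of a stream. Sequential processing is forced,
     * because the lambda below is stateful. A trailing unpaired element is dropped.
     */
    static <T> List<List<T>> pairUp(Stream<T> stream) {
        // Holds at most one element: the first half of the pair being built.
        // An ArrayList can hold null, so "empty" and "holds null" stay distinct.
        List<T> pending = new ArrayList<>(1);
        return stream.sequential()
                .map(a -> {
                    if (pending.isEmpty()) {
                        pending.add(a);
                        List<T> nothingYet = null; // marker removed by the filter below
                        return nothingYet;
                    }
                    List<T> pair = Arrays.asList(pending.remove(0), a);
                    return pair;
                })
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(pairUp(Stream.of("0", null, "2", "3", "4")));
    }
}
```

The pair itself is never null even when its elements are, so filtering on the pair is safe here.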
Peter Lawrey
4

If you don't want to collect the elements

The title of the question says collect pairs from a stream, so I'd assume that you want to actually collect these, but you commented:

Your solution works, the problem is that it loads the data from file to PairList and then I may use stream from this collection to process pairs. I can't do it because the data might be too big to store in the memory.

so here's a way to do this without collecting the elements.

It's relatively straightforward to transform an Iterator<T> into an Iterator<List<T>>, and from that to transform a stream into a stream of pairs.

  /**
   * Returns an iterator over pairs of elements returned by the iterator.
   * 
   * @param iterator the base iterator
   * @return the paired iterator
   */
  public static <T> Iterator<List<T>> paired(Iterator<T> iterator) {
    return new Iterator<List<T>>() {
      @Override
      public boolean hasNext() {
        return iterator.hasNext();
      }

      @Override
      public List<T> next() {
        T first = iterator.next();
        if (iterator.hasNext()) {
          return Arrays.asList(first, iterator.next());
        } else {
          return Arrays.asList(first);
        }
      }
    };
  }

  /**
   * Returns a stream of pairs of elements from a stream.
   * 
   * @param stream the base stream
   * @return the pair stream
   */
  public static <T> Stream<List<T>> paired(Stream<T> stream) {
    return StreamSupport.stream(Spliterators.spliteratorUnknownSize(paired(stream.iterator()), Spliterator.ORDERED),
        false);
  }

  @Test
  public void iteratorAndStreamsExample() {
    List<String> strings = Arrays.asList("a", "b", "c", "d", "e", "f");
    Iterator<List<String>> pairs = paired(strings.iterator());
    while (pairs.hasNext()) {
      System.out.println(pairs.next());
      // [a, b]
      // [c, d]
      // [e, f]
    }

    paired(Stream.of(1, 2, 3, 4, 5, 6, 7, 8)).forEach(System.out::println);
    // [1, 2]
    // [3, 4]
    // [5, 6]
    // [7, 8]
  }
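
If a dedicated Pair object (or any other combined value) is wanted instead of a two-element List, the same iterator wrapping can take a combining function. The following is only a sketch: the names pairedIterator, pairedStream and the small Pair class are hypothetical, and throwing on an odd number of elements is an assumption, since the desired behavior for that case was never specified.

```java
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.BiFunction;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class PairedWith {

    /** Hypothetical minimal pair type standing in for the question's Pair. */
    static final class Pair<A, B> {
        final A left;
        final B right;

        Pair(A left, B right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return "(" + left + ", " + right + ")";
        }
    }

    /** Wraps an iterator so that each call to next() consumes two source elements. */
    static <T, R> Iterator<R> pairedIterator(Iterator<T> it,
            BiFunction<? super T, ? super T, ? extends R> combiner) {
        return new Iterator<R>() {
            @Override
            public boolean hasNext() {
                return it.hasNext();
            }

            @Override
            public R next() {
                T first = it.next();
                if (!it.hasNext()) {
                    // Assumption: an unpaired trailing element is treated as an error.
                    throw new IllegalStateException("odd number of elements");
                }
                return combiner.apply(first, it.next());
            }
        };
    }

    /** Lazily turns a stream into a sequential stream of combined adjacent elements. */
    static <T, R> Stream<R> pairedStream(Stream<T> stream,
            BiFunction<? super T, ? super T, ? extends R> combiner) {
        Iterator<R> pairs = pairedIterator(stream.iterator(), combiner);
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(pairs, Spliterator.ORDERED), false);
    }

    public static void main(String[] args) {
        pairedStream(Stream.of("0", "1", "2", "3", "4", "5"), Pair::new)
                .forEach(System.out::println);
    }
}
```

Nothing is buffered beyond the pair currently being built, so this should suit input that is too large to hold in memory.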

If you want to collect the elements...

I'd do this by collecting into a list, and using an AbstractList to provide a view of the elements as pairs.

First, the PairList. This is a simple AbstractList wrapper around any list that has an even number of elements. (This could easily be adapted to handle odd length lists, once the desired behavior is specified.)

  /**
   * A view on a list of its elements as pairs.
   * 
   * @param <T> the element type
   */
  static class PairList<T> extends AbstractList<List<T>> {
    private final List<T> elements;

    /**
     * Creates a new pair list.
     * 
     * @param elements the elements
     * 
     * @throws NullPointerException if elements is null
     * @throws IllegalArgumentException if the length of elements is not even
     */
    public PairList(List<T> elements) {
      Objects.requireNonNull(elements, "elements must not be null");
      this.elements = new ArrayList<>(elements);
      if (this.elements.size() % 2 != 0) {
        throw new IllegalArgumentException("number of elements must have even size");
      }
    }

    @Override
    public List<T> get(int index) {
      // Pair i consists of the elements at positions 2*i and 2*i + 1.
      return Arrays.asList(elements.get(2 * index), elements.get(2 * index + 1));
    }

    @Override
    public int size() {
      return elements.size() / 2;
    }
  }

Then we can define the collector that we need. This is essentially shorthand for collectingAndThen(toList(), PairList::new):

  /**
   * Returns a collector that collects to a pair list.
   * 
   * @return the collector
   */
  public static <E> Collector<E, ?, PairList<E>> toPairList() {
    return Collectors.collectingAndThen(Collectors.toList(), PairList::new);
  }

Note that it could be worthwhile to define a PairList constructor that doesn't defensively copy the list, for the case where we know the backing list is freshly generated (as it is here). That's not essential right now, but once we had it, this method would become collectingAndThen(toCollection(ArrayList::new), PairList::newNonDefensivelyCopiedPairList).

And now we can use it:

  /**
   * Creates a pair list with collectingAndThen, toList(), and PairList::new
   */
  @Test
  public void example() {
    List<List<Integer>> intPairs = Stream.of(1, 2, 3, 4, 5, 6)
        .collect(toPairList());
    System.out.println(intPairs); // [[1, 2], [3, 4], [5, 6]]

    List<List<String>> stringPairs = Stream.of("a", "b", "c", "d")
        .collect(toPairList());
    System.out.println(stringPairs); // [[a, b], [c, d]]
  }
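
The non-copying variant mentioned in the note could look roughly like the sketch below. The factory names copyOf and wrap are hypothetical, and this version keeps toList() as the downstream collector; since the finisher receives the list the collector has just built, wrapping it without a copy seems reasonable under that assumption.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PairListSketch {

    /** A read-only view of a list's elements as consecutive pairs. */
    static final class PairList<T> extends AbstractList<List<T>> {
        private final List<T> elements;

        // Private: callers pick the copying or the wrapping factory explicitly.
        private PairList(List<T> elements) {
            if (elements.size() % 2 != 0) {
                throw new IllegalArgumentException("number of elements must be even");
            }
            this.elements = elements;
        }

        /** Defensive copy: later changes to the argument do not affect the view. */
        static <T> PairList<T> copyOf(List<T> elements) {
            Objects.requireNonNull(elements, "elements must not be null");
            return new PairList<>(new ArrayList<>(elements));
        }

        /** No copy: the caller promises not to modify the list afterwards. */
        static <T> PairList<T> wrap(List<T> elements) {
            return new PairList<>(Objects.requireNonNull(elements, "elements must not be null"));
        }

        @Override
        public List<T> get(int index) {
            return Arrays.asList(elements.get(2 * index), elements.get(2 * index + 1));
        }

        @Override
        public int size() {
            return elements.size() / 2;
        }
    }

    /** Collects into a list, then wraps that freshly built list without copying it. */
    static <E> Collector<E, ?, PairList<E>> toPairList() {
        return Collectors.collectingAndThen(Collectors.toList(), PairList::wrap);
    }

    public static void main(String[] args) {
        List<List<Integer>> pairs = Stream.of(1, 2, 3, 4, 5, 6).collect(toPairList());
        System.out.println(pairs); // [[1, 2], [3, 4], [5, 6]]
    }
}
```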

Here's a complete source file with a runnable example (as a JUnit test):

package ex;

import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import org.junit.Test;

public class PairCollectors {

  /**
   * A view on a list of its elements as pairs.
   * 
   * @param <T> the element type
   */
  static class PairList<T> extends AbstractList<List<T>> {
    private final List<T> elements;

    /**
     * Creates a new pair list.
     * 
     * @param elements the elements
     * 
     * @throws NullPointerException if elements is null
     * @throws IllegalArgumentException if the length of elements is not even
     */
    public PairList(List<T> elements) {
      Objects.requireNonNull(elements, "elements must not be null");
      this.elements = new ArrayList<>(elements);
      if (this.elements.size() % 2 != 0) {
        throw new IllegalArgumentException("number of elements must have even size");
      }
    }

    @Override
    public List<T> get(int index) {
      // Pair i consists of the elements at positions 2*i and 2*i + 1.
      return Arrays.asList(elements.get(2 * index), elements.get(2 * index + 1));
    }

    @Override
    public int size() {
      return elements.size() / 2;
    }
  }

  /**
   * Returns a collector that collects to a pair list.
   * 
   * @return the collector
   */
  public static <E> Collector<E, ?, PairList<E>> toPairList() {
    return Collectors.collectingAndThen(Collectors.toList(), PairList::new);
  }

  /**
   * Creates a pair list with collectingAndThen, toList(), and PairList::new
   */
  @Test
  public void example() {
    List<List<Integer>> intPairs = Stream.of(1, 2, 3, 4, 5, 6)
        .collect(toPairList());
    System.out.println(intPairs); // [[1, 2], [3, 4], [5, 6]]

    List<List<String>> stringPairs = Stream.of("a", "b", "c", "d")
        .collect(toPairList());
    System.out.println(stringPairs); // [[a, b], [c, d]]
  }
}
Joshua Taylor
  • Note: the OP wants to use a `Stream` of an unknown length as input. – Peter Lawrey Feb 28 '18 at 11:20
  • @PeterLawrey Yes, this works in this case. This provides a collector that works with a stream of any length. All this is doing is defining a `PairList` class that is a view on a list of its elements as pairs. Once we have that, all we *really* need is `collectingAndThen(toList(), PairList::new)`. – Joshua Taylor Feb 28 '18 at 11:20
  • Your solution works, the problem is that it loads the data from file to PairList and then I may use stream from this collection to process pairs. I can't do it because the data might be too big to store in the memory. – niemar Feb 28 '18 at 11:42
  • @niemar The title of the question says "**collect** pairs from a stream". Do you not want to actually *collect* them, but just to turn a `Stream<T>` into a `Stream<List<T>>`? (It might also be easier to turn the `Iterator<T>` into an `Iterator<List<T>>`, and then convert *that* to a stream.) – Joshua Taylor Feb 28 '18 at 11:49
  • @niemar I've added a method for getting the pairs without collecting things into memory. It transforms the Iterator<T> into an Iterator<List<T>>, and can do the same for streams. Does that fit your use case better? – Joshua Taylor Feb 28 '18 at 12:03
  • Yes, you are right, the title was a little misleading. I want to have Stream<Pair> – niemar Feb 28 '18 at 12:09
  • @niemar then the updated solution should hopefully work for you :) – Joshua Taylor Feb 28 '18 at 12:10
  • Yes, works great, I only modified your code to use Pair instead of ArrayList. – niemar Feb 28 '18 at 12:49
  • @niemar nice! And really, any binary function could be used to take adjacent elements to a value. I guess there's no need for that here, but it wouldn't be too hard. – Joshua Taylor Feb 28 '18 at 12:51
  • @JoshuaTaylor you can do this directly, like so https://stackoverflow.com/a/49029144/1059372 – Eugene Feb 28 '18 at 12:56
  • @Eugene, I'm always a fan of more direct, but implementing a spliterator based on an iterator seems more involved than wrapping the iterator in another iterator, and wrapping that in a spliterator. I'd probably consider the additional complexity a tradeoff for performance, if it's a bottleneck in execution. – Joshua Taylor Feb 28 '18 at 13:15
  • @JoshuaTaylor more involved and more complex? under covers your solution does a lot more, *at least* doing a `hasNext/next` sequence twice for different Iterators – Eugene Feb 28 '18 at 13:22
3

Assuming there is a generic Pair<T> with left and right fields, getters for both, and a two-argument constructor:

 static class Paired<T> extends AbstractSpliterator<Pair<T>> {

    private List<T> list = new ArrayList<>(2);

    private final Iterator<T> iter;

    public Paired(Iterator<T> iter) {
        super(Long.MAX_VALUE, 0);
        this.iter = iter;
    }

    @Override
    public boolean tryAdvance(Consumer<? super Pair<T>> consumer) {
        getBothIfPossible(iter);
        if (list.size() == 2) {
            consumer.accept(new Pair<>(list.remove(0), list.remove(0)));
            return true;
        }
        return false;
    }

    private void getBothIfPossible(Iterator<T> iter) {
        while (iter.hasNext() && list.size() < 2) {
            list.add(iter.next());
        }
    }

}

Usage would be:

 Iterator<Integer> iterator = List.of(1, 2, 3, 4, 5).iterator();
 Paired<Integer> p = new Paired<>(iterator);
 StreamSupport.stream(p, false)
            .forEach(pair -> System.out.println(pair.getLeft() + "  " + pair.getRight()));
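
The code above relies on a Pair that is generic in a single type parameter and exposes getLeft() and getRight(). A minimal sketch of such a class, written here only to make that assumption concrete (the wrapper class name is hypothetical, and a library Pair with two type parameters would need the declarations above adjusted):

```java
class SimplePairDemo {

    /** Immutable pair whose two components share one element type. */
    static final class Pair<T> {
        private final T left;
        private final T right;

        Pair(T left, T right) {
            this.left = left;
            this.right = right;
        }

        T getLeft() {
            return left;
        }

        T getRight() {
            return right;
        }

        @Override
        public String toString() {
            return "(" + left + ", " + right + ")";
        }
    }

    public static void main(String[] args) {
        Pair<Integer> p = new Pair<>(1, 2);
        System.out.println(p.getLeft() + "  " + p.getRight());
    }
}
```

Note also that List.of in the usage snippet requires Java 9 or later; on Java 8, Arrays.asList would serve the same purpose.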
Eugene
  • FYI, I just tried this and it does not work. There's a compilation error when creating the Paired object. Because the Paired class, expects a Pair iterator, but you are providing an Integer iterator. – Stefanos Kalantzis Dec 05 '22 at 08:11
3

I know I'm late to the party, but all of the answers seem to be really complicated or have a lot of GC overhead/short-lived objects (which is not a big deal with modern JVMs), but why not do it simply like this?

public class PairCollaterTest extends TestCase {
    static class PairCollater<T> implements Function<T, Stream<Pair<T, T>>> {
        T prev;

        @Override
        public Stream<Pair<T, T>> apply(T curr) {
            if (prev == null) {
                prev = curr;
                return Stream.empty();
            }
            try {
                return Stream.of(Pair.of(prev, curr));
            } finally {
                prev = null;
            }
        }
    }

    public void testPairCollater() {
        Stream.of("0", "1", "2", "3", "4", "5").sequential().flatMap(new PairCollater<>()).forEach(System.out::println);
    }
}

Prints:

(0,1)
(2,3)
(4,5)
J. Dimeo
0

Just replace IntStream.range(1, 101) with your stream (you don't need to know your stream's size). Because the lambdas share mutable state, this only works on a sequential stream:

import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

public class TestClass {

    public static void main(String[] args) {

        final Pair pair = new Pair();
        final List<Pair> pairList = new ArrayList<>();

        IntStream.range(1, 101)
                .map(i -> {
                    if (pair.a == null) {
                        pair.a = i;
                        return 0;
                    } else {
                        pair.b = i;
                        return 1;
                    }
                })
                .filter(i -> i == 1)
                .forEach(i -> {
                    pairList.add(new Pair(pair));
                    pair.reset();
                });

        pairList.stream().forEach(p -> System.out.print(p + " "));
    }

    static class Pair {
        public Object a;
        public Object b;

        public Pair() {
        }

        public Pair(Pair orig) {
            this.a = orig.a;
            this.b = orig.b;
        }

        void reset() {
            a = null;
            b = null;
        }

        @Override
        public String toString() {
            return "{" + a + "," + b + '}';
        }
    }

}
Ashutosh A