
I have a method which returns a list of items and takes a limit (passed to Stream#limit) as a parameter:

public List<Integer> getItems(Long limit) {
    return IntStream.range(1, 10)
            .limit(limit)
            .boxed()
            .collect(Collectors.toList());
}  

How can I set the parameter so that all items are returned (i.e. no limit)?

My attempts:

    Long limit5 = 5L;
    System.out.println("With limit 5:" + getItems(limit5));
    // works fine: 5 items

    Long noLimitZero = 0L;
    System.out.println("Without limit (zero): " + getItems(noLimitZero));
    // why does 0 mean "no items" instead of "all items"?

    Long noLimitNegative = -1L;
    System.out.println("Without limit (negative number): " + getItems(noLimitNegative));
    // IllegalArgumentException

    Long noLimitNull = null;
    System.out.println("Without limit (null): " + getItems(noLimitNull));
    // NullPointerException

Passing Long.MAX_VALUE is not a solution.

MongoDB inconsistency

For example, MongoDB's FindIterable#limit treats 0 as "no limit" and returns all documents.

public List<Integer> getItems(Long limit) {
    MongoDatabase mongo = new MongoClient().getDatabase("example");
    MongoCollection<Document> documents = mongo.getCollection("items");
    FindIterable<Document> found = documents.find();
    List<Integer> items = new ArrayList<>();
    for (Document doc : found.limit(limit.intValue())) {
        items.add(doc.getInteger("number"));
    }
    return items;
}

This inconsistency between the methods causes incompatibility: for example, an interface with the method List<Integer> getItems(Long limit) and two implementations, one in memory and one backed by MongoDB, will interpret the same argument differently.
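To make the problem concrete, imagine a hypothetical interface (the name is invented for illustration), implemented once with Stream#limit and once with FindIterable#limit:

import java.util.List;

// Hypothetical interface: with the in-memory (Stream#limit) implementation a
// caller passing limit = 0 gets an empty list, while with the MongoDB
// (FindIterable#limit) implementation the same call returns every item.
interface ItemRepository {
    List<Integer> getItems(Long limit);
}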

By contrast, Stream#skip and FindIterable#skip behave consistently:

          ----------------------------
          | Java       | MongoDB    |
---------------------------------------
limit = 0 | no items   | all items  |
---------------------------------------
skip = 0  | skips none | skips none |
---------------------------------------

Refactor method with Stream#limit

I guess there is no way to pass a "no limit" parameter to Stream#limit, so I have to refactor this method to take a limit and treat 0, null, or -1 as "no limit".

public static List<Integer> getItems(Long limit) {
    if (limit == null || limit == 0 || limit == -1) {
        return IntStream.range(1, 10)
                .boxed()
                .collect(Collectors.toList());
    } else {
        return IntStream.range(1, 10)
                .limit(limit)
                .boxed()
                .collect(Collectors.toList());
    }
}

Or:

public static List<Integer> getItems(Long limit) {
    IntStream items = IntStream.range(1, 10);
    if (limit != null && limit != 0 && limit != -1) {
        items = items.limit(limit);
    }
    return items.boxed()
            .collect(Collectors.toList());
}

Is there a better way to achieve consistency between these limit methods?

mkczyk
  • What you have looks fine to me – GBlodgett Mar 24 '19 at 20:27
  • Why is "passing `Long.MAX_VALUE` not a solution"? – daniu Mar 24 '19 at 21:24
  • Passing `Long.MAX_VALUE` is not a solution because it is only a workaround. In this case (an in-memory implementation replacing fetching from a database) there is theoretically a risk that there may be more than `2^63-1` records ;) (though practically that isn't an argument). But I would like to know how to work with Java streams. In-memory Java streams can be very long or even infinite (for example `IntStream.iterate(0, i -> i + 1)`), and there may be a case where `Long.MAX_VALUE` is too small. – mkczyk Mar 24 '19 at 22:16
  • Use the last version. When you don’t want a limit, don’t call `limit`, but avoid code duplication. Of course, when the stream source does already support a size (like `IntStream.range`, `Arrays.stream(…)`, `List.subList(…).stream()`, or `Random.ints(…)`), you should prefer specifying the size in the first place. Using `limit`, even with `Long.MAX_VALUE` *has* performance drawbacks. – Holger Mar 25 '19 at 08:31
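A rough sketch of what the last comment suggests (my interpretation, not Holger's code): when the stream source already supports a size, encode the bound in the source itself instead of calling limit(). Assuming the same 1..9 range and treating null, 0, and negative values as "no limit":

public static List<Integer> getItems(Long limit) {
    // Encode the bound in IntStream.range instead of calling limit()
    int end = (limit == null || limit <= 0)
            ? 10                              // full range 1..9
            : (int) Math.min(10, limit + 1);  // first `limit` items
    return IntStream.range(1, end)
            .boxed()
            .collect(Collectors.toList());
}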

1 Answer


So there are several layers of problems with what you're trying to do.

You say "practicality isn't an argument" and that's fine, but let me just point out that Long.MAX_VALUE is about 9.2 × 10^18, so the probability that you're getting more entries than that from a database is vanishingly small. Not to mention that you go on collecting that data into a list, so you would run into memory issues in your own application long before reaching that count.

The second thing is that the semantics of limit() are that it imposes a fixed limit on the number of entries, and "infinity" is not a fixed limit; hence limit() just isn't what you're looking for.

Third, you seem to be looking for a way around that, and there is a pattern you can use: maintaining your own counter. What you want is something like an AtomicBigInteger, which doesn't exist in the JDK but is shown here.
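For reference, a minimal sketch of what such an AtomicBigInteger might look like (the usual AtomicReference-based approach; this is not a JDK class):

import java.math.BigInteger;
import java.util.concurrent.atomic.AtomicReference;

// Minimal sketch of an AtomicBigInteger (not part of the JDK).
class AtomicBigInteger {
    private final AtomicReference<BigInteger> value =
            new AtomicReference<>(BigInteger.ZERO);

    // Atomically adds one to the current value and returns the new value.
    public BigInteger incrementAndGet() {
        return value.accumulateAndGet(BigInteger.ONE, BigInteger::add);
    }

    public BigInteger get() {
        return value.get();
    }
}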

So what you'd do is create a Predicate like this

class BelowValue<T> implements Predicate<T> {
    BigInteger limit = BigInteger.ZERO;
    AtomicBigInteger counter = new AtomicBigInteger();

    public BelowValue(BigInteger limit) {
        this.limit = limit;
    }        
    public BelowValue() {}

    public boolean test(T ignored) {
        // short circuit on zero: a limit of zero means "no limit", everything passes
        if (BigInteger.ZERO.compareTo(limit) == 0) { return true; }

        // pass elements only while the counter has not exceeded the limit
        return counter.incrementAndGet().compareTo(limit) <= 0;
    }
}

and then you can use it in your stream like this (Java 8):

Predicate<Integer> filter = new BelowValue<>(limit);
return stream           // e.g. IntStream.range(1, 10)
    .boxed()            // box first so the Predicate<Integer> can be applied
    .filter(filter)
    .collect(Collectors.toList());

Note however that filter is not a short-circuiting operation, so if you have an infinite stream, this will not terminate (and be very inefficient if your stream is much longer than the limit size).

Java 9's takeWhile is short-circuiting, so you can substitute that for filter in the above example.
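For instance, a sketch along those lines (same BelowValue predicate and limit variable as above; with a limit of zero, meaning "no limit", the predicate always returns true):

// Java 9+: takeWhile short-circuits, so the pipeline stops pulling elements
// as soon as the predicate returns false.
Predicate<Integer> below = new BelowValue<>(limit);
List<Integer> items = IntStream.range(1, 10)
    .boxed()
    .takeWhile(below)
    .collect(Collectors.toList());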

daniu
  • Don’t do this. The counter may be thread-safe, but your assumption that the processing order matches the encounter order is not. – Holger Mar 25 '19 at 08:34
  • @Holger I did not assume that. – daniu Mar 25 '19 at 08:45
  • Your code does. It will break with a parallel stream, as the counter-based predicate allows elements to pass based on the order in which they are evaluated, which is not the semantic order: a `limit` operation is supposed to let the *first* *n* elements in encounter order pass. – Holger Mar 25 '19 at 08:50