8

Throughout the Google Guava library, I have noticed the tendency to use the "one (or two) plus var args" technique.

Examples:

  • void add(T value, T... moreValueArr)
  • void add(T value, T value2, T... moreValueArr)

It took me a while to figure out why: to prevent calls with zero arguments (in the first case) or with only one argument (in the second case).
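A minimal sketch of that effect, assuming a hypothetical container class `ValueBag` (the class and its `size()` method are made up for illustration and are not Guava code):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container, used only to illustrate the signature style.
class ValueBag<T> {
    private final List<T> values = new ArrayList<>();

    // "One plus varargs": the caller must supply at least one value.
    @SafeVarargs
    final void add(T value, T... moreValueArr) {
        values.add(value);
        for (T v : moreValueArr) {
            values.add(v);
        }
    }

    int size() {
        return values.size();
    }
}

class ValueBagDemo {
    public static void main(String[] args) {
        ValueBag<String> bag = new ValueBag<>();
        bag.add("a");            // moreValueArr is an empty array
        bag.add("b", "c", "d");  // moreValueArr holds "c" and "d"
        // bag.add();            // does not compile: no overload accepts zero arguments
        System.out.println(bag.size()); // prints 4
    }
}
```

With a plain `add(T... valueArr)` the commented-out call would compile and the method would have to decide at runtime what an empty call means.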

Expanding further on this technique, if given the choice between scenarios A and B below, which is preferable? I am hoping someone with deep Java knowledge can provide insight.

Scenario A: (two methods)

  1. void add(T... valueArr)
  2. void add(Iterable<? extends T> iterable)

Scenario B: (three methods)

  1. void add(T value, T... moreValueArr)
  2. void add(T[] valueArr)
  3. void add(Iterable<? extends T> iterable)

One idea why B might be better: I have noticed many Java programmers are not aware that arrays can be passed directly as var args. Thus, B might provide a hint about what is possible.

Finally, I realize B has additional development, testing, and maintenance overhead. Please leave those considerations aside.

This question is a subtle variation on my original question: Java varargs method param list vs. array

kevinarpe
  • 1
    `void add(T... valueArr)` and `void add(T[] valueArr)` are same but you need to construct the array manually to pass it in second form. – Braj Aug 09 '14 at 11:00
  • `void add(T... valueArr)` means **zero and more** T type of objects are acceptable where as `void add(T value, T... moreValueArr)` means **one and more**. Now the choice depends on the program and requirement. – Braj Aug 09 '14 at 11:01
  • `void add(Iterable iterable)` is something different, that is more preferable in case of **Collection** object – Braj Aug 09 '14 at 11:04
  • 3
    @user3218114 - An answer is better than 3 comments :) – TheLostMind Aug 09 '14 at 11:05
  • 1
    @TheLostMind OK just deleted. – Braj Aug 09 '14 at 11:15
  • I'm somewhat confused: The question already implies that you noticed that the intention here was to *enforce* a certain number of parameters. So A and B are the same, *except* for the fact that in B1, at least one parameter is enforced - but in B2 and B3 this is not the case, so why have B1 at all? (And if you leave B1 out, then A and B basically *are* the same...). – Marco13 Aug 11 '14 at 19:47

5 Answers

7

Conclusion will be drawn at the end. Skip to the end if you just want the conclusion.


The primary goal is performance.

If there are many use cases where only one or two elements are passed, dedicated parameters for those elements avoid creating an array to hold them. Yes, a zero-length array is still passed for the vararg parameter, but since a zero-length array has no elements that could be modified, the JVM is in principle free to optimize it, for example by reusing a shared instance, in which case it has practically no performance impact.

The most shining example for this is the EnumSet.of() methods (which returns an EnumSet of the listed enum instances).

You'll see the following overloads:

static <E extends Enum<E>> EnumSet<E> of(E e);
static <E extends Enum<E>> EnumSet<E> of(E first, E... rest);
static <E extends Enum<E>> EnumSet<E> of(E e1, E e2);
static <E extends Enum<E>> EnumSet<E> of(E e1, E e2, E e3);
static <E extends Enum<E>> EnumSet<E> of(E e1, E e2, E e3, E e4);
static <E extends Enum<E>> EnumSet<E> of(E e1, E e2, E e3, E e4, E e5);

If you call the of() method with five or fewer elements, no array is created, because there are overloads that take up to five elements directly. Performance is also mentioned in the javadoc of EnumSet.of(E first, E... rest):

This factory, whose parameter list uses the varargs feature, may be used to create an enum set initially containing an arbitrary number of elements, but it is likely to run slower than the overloadings that do not use varargs.
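To illustrate how the compiler picks between such overloads, here is a small sketch. The class `OverloadDemo` and its `of` methods are hypothetical stand-ins that only mirror the shape of the EnumSet.of signatures; they are not JDK code.

```java
// Hypothetical class mirroring the overload style of EnumSet.of.
class OverloadDemo {
    static String of(String e1) {
        return "fixed-1";
    }

    static String of(String e1, String e2) {
        return "fixed-2";
    }

    static String of(String first, String... rest) {
        // Only reached when no fixed-arity overload matches the call.
        return "varargs-" + (1 + rest.length);
    }

    public static void main(String[] args) {
        System.out.println(of("a"));           // prints fixed-1
        System.out.println(of("a", "b"));      // prints fixed-2
        System.out.println(of("a", "b", "c")); // prints varargs-3
    }
}
```

Java's overload resolution only considers variable-arity invocation when no fixed-arity method is applicable, so the first two calls never allocate a varargs array.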

As to why to use of(E first, E... rest) even if there is a separate of(E e1, E e2):

This is simply a convenience for the implementation. If you declare a first parameter, you can use it without having to check the array's length or having to use indexing. You can use it to check its type, for example (which is often important when working with generics).

It doesn't really force to pass at least one argument because you can just as easily pass a null as you could pass an empty array if there would only be the vararg parameter.

Vararg vs array

If the type of the array is not a primitive type then there is no real difference, except the array parameter forces you to explicitly create the array while the vararg parameter allows you to either pass an array or just list the elements.
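The caveat about primitive types can be seen in a short sketch. The generic helper `count` below is hypothetical and exists only to show how many arguments the method receives:

```java
class VarargArrayDemo {
    // Hypothetical helper: reports how many elements the vararg parameter received.
    @SafeVarargs
    static <T> int count(T... values) {
        return values.length;
    }

    public static void main(String[] args) {
        Integer[] boxed = {1, 2, 3};
        int[] primitive = {1, 2, 3};

        System.out.println(count(1, 2, 3));   // prints 3: elements listed directly
        System.out.println(count(boxed));     // prints 3: the array is used as the vararg array
        System.out.println(count(primitive)); // prints 1: the whole int[] becomes a single element
    }
}
```

An `int[]` is not a `T[]` for any reference type `T`, so the compiler wraps it in a one-element array instead of passing it through.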

Apart from historical reasons for array parameters (varargs only arrived in Java 5.0), there are still places where arrays have to be used:

  1. If the array parameter is an "outgoing" parameter, meaning the method receiving the array wishes to fill the array. Example: InputStream.read(byte[] b)
  2. If there are other parameters, possibly multiple arrays, a vararg is obviously not enough, as there can be only one vararg parameter (which must also be the last one).
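The first case can be sketched with the JDK's own InputStream.read(byte[]). The wrapper class `OutgoingArrayDemo` and its `fill` method are hypothetical names chosen for this illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class OutgoingArrayDemo {
    // Hypothetical helper: lets the stream write into the caller's buffer
    // and returns the number of bytes that were written.
    static int fill(byte[] buffer) {
        try (InputStream in = new ByteArrayInputStream(new byte[] {10, 20, 30})) {
            return in.read(buffer); // the callee fills the array passed by the caller
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[8];
        int n = fill(buffer);
        System.out.println(n);         // prints 3
        System.out.println(buffer[0]); // prints 10
    }
}
```

A vararg parameter would make no sense here, because the caller needs a reference to the array in order to read the result afterwards.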

Iterable parameter

Iterable in this case is an alternative to pass multiple values to the method but it is not a replacement for array and vararg parameters. Iterable is for collections, as arrays or listing the elements cannot be used if the parameter is Iterable (arrays don't implement Iterable). If the caller has the input data in the form of a collection (e.g. List or Set), the Iterable parameter is the most convenient, the most general and the most efficient way to pass the elements.
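A short sketch of the difference, using a hypothetical `collect` method that takes an Iterable parameter:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class IterableParamDemo {
    // Hypothetical method: copies whatever the Iterable yields into a new list.
    static <T> List<T> collect(Iterable<? extends T> iterable) {
        List<T> result = new ArrayList<>();
        for (T item : iterable) {
            result.add(item);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> set = new LinkedHashSet<>(Arrays.asList("a", "b"));
        String[] array = {"c", "d", "e"};

        List<String> fromSet = collect(set);                   // a collection is passed as-is
        // collect(array);                                     // does not compile: String[] is not an Iterable
        List<String> fromArray = collect(Arrays.asList(array)); // an array needs a List view first

        System.out.println(fromSet.size() + fromArray.size()); // prints 5
    }
}
```

Arrays.asList only creates a view over the array, so the adaptation on the caller's side is cheap.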


Drawing a conclusion

Back to your original Scenario A and B. Since both of your scenarios contain the method with the Iterable parameter and since Iterables do not "mix" with arrays, I reduce the question by omitting those:

Scenario A: (one method)

  1. void add(T... valueArr)

Scenario B: (two methods)

  1. void add(T value, T... moreValueArr)
  2. void add(T[] valueArr)

Since the method add() only reads from the array (and does not write to it; assumed from the name add), every use case can be solved with either scenario. I would go with Scenario A for simplicity.

icza
  • Why was this downvoted? It is a detailed, thoughtful answer. – kevinarpe Aug 14 '14 at 03:47
  • I have no idea. I respect rightful downvotes but I'd like to know the reason so I can learn from it and improve my answers. – icza Aug 14 '14 at 05:52
  • 1
    I didn't downvote, but your argument that "*It doesn't really force to pass at least one argument because you can just as easily pass a `null`*" is inaccurate. `method()` and `method(null)` are two different calls, and the latter has one argument. The `of(E first, E... rest)` pattern does enforce a minimum parameter count. Null checking is a separate issue. – dimo414 Aug 14 '14 at 15:28
  • @icza: Tons of great ideas / points here. Can you draw a conclusion about A vs. B? – kevinarpe Aug 16 '14 at 11:09
  • Edited my answer to add a conclusion. – icza Aug 18 '14 at 08:00
5

There's no particular reason the choices you offer as A and B are mutually exclusive, nor the only options available. Instead, recognize that varargs solve a different problem than collections.

Guava (and modern Java best practices) highly encourage using collections over arrays. They are much more extensible and interchangeable than arrays, and offer more powerful abstractions, such as lazy iteration.

On the other hand, varargs provide a nice way to invoke methods. If you expect people to want to pass in arguments directly, rather than as part of a collection, varargs is more convenient. Just about all varargs style methods should simply call your collection equivalent methods, and do as little else as possible, however, because that convenience becomes inconvenience as soon as you have to actually work with the arrays. There is very little value in creating new methods that take arrays as arguments directly.

But why the weird vararg signatures?

There are a few reasons you want or need the type of signatures you're seeing Guava use a lot:

  1. To enforce stricter signatures:

    You can call a vararg method with any number of arguments, including zero. If your method doesn't make any sense without any parameters, a signature of void add(T one, T ... rest) enforces that requirement at the type level.

  2. To avoid generics conflicts:

    Sometimes you'll want to define a varargs method that, due to type erasure, is identical to another method. For instance, if you defined void add(T ... arr) as well as void add(Iterable<T> iter) with a generic enough type, it's possible passing an iterable would actually match the varargs method. Using void add(T one, T ... arr) helps keep these methods distinct for the compiler. (I'll try to find a more concrete example of this issue).

  3. To avoid object creation:

    Sometimes you'll see method signatures that seem like they could be varargs, but aren't, such as the overloads of ImmutableList.of(). The purpose of these methods is to avoid the extra array allocation that must happen behind the scenes in a varargs call. This is really more of a case of leaky abstractions than anything else, and functionally it's safe to ignore. Unless you're implementing a method that you can expect to be called as often as things like Guava utilities, the allocation savings are probably not worth the added code complexity.

In conclusion, stick to writing methods that take Iterable, Iterator, or an appropriate Collection sub-interface. If you anticipate users of your methods wanting to pass in individual values, consider adding varargs methods that wrap the arguments into collections internally, as a convenience for the caller.
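As a rough sketch of that recommendation using only the JDK: the class `Registry` and its members are hypothetical, and a real implementation would do its actual work inside the Iterable method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class; the names are illustrative only.
class Registry<T> {
    private final List<T> items = new ArrayList<>();

    // The primary method: all logic lives in the Iterable version.
    void add(Iterable<? extends T> iterable) {
        for (T item : iterable) {
            items.add(item);
        }
    }

    // Convenience overload: wraps the arguments in a list and delegates.
    @SafeVarargs
    final void add(T first, T... rest) {
        List<T> all = new ArrayList<>(rest.length + 1);
        all.add(first);
        all.addAll(Arrays.asList(rest));
        add(all);
    }

    List<T> items() {
        return items;
    }
}

class RegistryDemo {
    public static void main(String[] args) {
        Registry<String> registry = new Registry<>();
        registry.add("a", "b");              // varargs convenience
        registry.add(Arrays.asList("c"));    // collection passed directly
        registry.add("d");                   // single value
        System.out.println(registry.items()); // prints [a, b, c, d]
    }
}
```

The varargs overload stays a thin wrapper, so there is only one code path to test and maintain.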

dimo414
2

I'd go with a slightly modified version of b:

void add(T first, T... more) {
   // call the second version
   add(Lists.asList(first, more));
}
void add(Iterable<? extends T> data) {
   ... // do stuff here
}

You should not provide an array method. Arrays are outdated and a maintenance nightmare. If clients of your library actually have an array in hand, they can still pass it to the second method wrapped in Arrays.asList(...).

If you are more of a control freak you might do an additional check in the first method:

void add (T first, T ... more){
   // call the second version
   add(more.length == 0 
             ? Collections.singleton(first)
             : Lists.asList(first, more));
}

However, I doubt it will be any more efficient than just using Lists.asList.

Sean Patrick Floyd
  • See suggestion from @kajacx. At the very least, I should add `? extends` to my `Iterable` type. – kevinarpe Aug 09 '14 at 12:28
  • @kevinarpe `extends` granted and added. the rest is nonsense. having a method for adding zero or more values makes no sense, zero elements is no-op and should not be allowed. Arrays are a relic from the past and should also be avoided (See Effective Java 2nd Edition by Joshua Bloch, Item 25: Prefer Lists to Arrays). – Sean Patrick Floyd Aug 09 '14 at 12:33
  • 1
    While reasonable most of the time, I would hesitate to make a blanket claim like always add `? extends`. This changes the type you're working with, sometimes in unexpected ways. Prefer to be restrictive with your types. You can always expand from `<T>` to `<? extends T>` if you discover a need, but you will break backwards compatibility if you realize your use case cannot support `? extends`. – dimo414 Aug 11 '14 at 22:14
  • @dimo414: Interesting point. Can you provide an example where `? extends` does not work? – kevinarpe Aug 14 '14 at 03:44
  • @kevinarpe Well, a trivial example would be any case where you should be using `? super` instead. In general, `? extends` *is* probably going to do what you want, but it being a generally reasonable decision is a far cry from making it a rule for all generic method signatures. Err on the side of caution and declare only the types you know you need; you can always expand your type signature later. – dimo414 Aug 14 '14 at 15:24
  • @dimo414: I agree: `? extends` seems easier to get right in comparison with `? super` – kevinarpe Aug 16 '14 at 11:14
0

I would go with A, and encourage users (in the method documentation) to pass an array to your method instead of varargs, since an anonymous array has to be created each time a varargs method is called, as described here.

You can also improve your add(Iterable<T> data) method by using a wildcard and changing its signature to add(Iterable<? extends T> data).

Edit: (reply to @Torben's comment)

Let's say you call add(a, b, c) a lot. If an anonymous array gets created every time, as described in the link above, then you will waste time allocating memory on the heap as well as keeping your garbage collector busy.

On the other hand, I'm not sure how much optimization is done, so it might not be that bad. However, you can make sure that no arrays are created by allocating a single array yourself and reusing it:

YourType[] arr3 = new YourType[3]; //declared as field
...
arr3[0] = a;
arr3[1] = b;
arr3[2] = c;
add(arr3);

But this is not thread-safe, so in a multithreaded environment, make sure that each thread uses a different array.
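One possible way to give each thread its own array is a ThreadLocal. This is only a sketch: the class `ReusedArrayDemo`, the stand-in `add` method, and the helper `addThree` are hypothetical, and the approach assumes the called method never keeps a reference to the array.

```java
class ReusedArrayDemo {
    // One reusable 3-element array per thread, created lazily on first use.
    private static final ThreadLocal<String[]> ARR3 =
            ThreadLocal.withInitial(() -> new String[3]);

    // Stand-in for the real varargs method; it must not retain the array.
    static String add(String... values) {
        return String.join(",", values);
    }

    // Hypothetical helper: fills the current thread's array and passes it on.
    static String addThree(String a, String b, String c) {
        String[] arr3 = ARR3.get();
        arr3[0] = a;
        arr3[1] = b;
        arr3[2] = c;
        return add(arr3); // the existing array is passed, no new array is allocated
    }

    public static void main(String[] args) {
        System.out.println(addThree("a", "b", "c")); // prints a,b,c
        System.out.println(addThree("x", "y", "z")); // prints x,y,z
    }
}
```

Whether this is worth the extra complexity depends on measurements; the ThreadLocal lookup has a cost of its own.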

kajacx
  • That does not make sense. If the method requires an arbitrary number of parameters, then an array or list must be used. There is no performance difference between implicit and explicit array creation (except that the compiler can probably optimize implicit array creation more efficiently). – Torben Aug 14 '14 at 06:23
0

The obvious difference is that if you define the methods

void a(int[] args) { }
void b(int ... args) { }

You can not call

a(1, 2, 3);

But you can call

b(new int[] { 1, 2, 3 });
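Put together as a compilable sketch (the methods return the number of elements only so that there is something to print; the class name `ArrayVsVarargsDemo` is made up):

```java
class ArrayVsVarargsDemo {
    static int a(int[] args) {
        return args.length;
    }

    static int b(int... args) {
        return args.length;
    }

    public static void main(String[] args) {
        System.out.println(a(new int[] { 1, 2, 3 })); // prints 3
        // a(1, 2, 3);                                // does not compile
        System.out.println(b(1, 2, 3));               // prints 3
        System.out.println(b(new int[] { 1, 2, 3 })); // prints 3
        System.out.println(b());                      // prints 0: an empty array is passed
    }
}
```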

There is really only need for array-typed parameters if you want to have parameters after the array.

void a(int[] args, int arg) { }
void a(int ... args, int arg) { } // Won't even compile.

The other difference is that if your method requires at least one parameter, then you have to check the length of the parameter array at runtime, unless you define your method like this

void c(int arg, int ... args) { }

You make a trade off in the code, as you have to process the first argument differently from the rest, but IMO the stricter API definition is worth it.
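A small sketch of that trade-off, using a hypothetical `max` function written in both styles:

```java
class MinArgsDemo {
    // Stricter API: the compiler guarantees at least one argument,
    // but the first argument is handled separately from the rest.
    static int max(int first, int... rest) {
        int result = first;
        for (int value : rest) {
            if (value > result) {
                result = value;
            }
        }
        return result;
    }

    // Plain varargs: uniform processing, but the length check happens at runtime.
    static int maxChecked(int... values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("at least one value is required");
        }
        int result = values[0];
        for (int value : values) {
            if (value > result) {
                result = value;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(max(3, 7, 5));        // prints 7
        System.out.println(maxChecked(3, 7, 5)); // prints 7
        // max();                                // does not compile
        // maxChecked();                         // compiles, but throws IllegalArgumentException
    }
}
```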

Torben