4

I'm working with Java 7, and I'm searching in the Guava API for a way to apply a function to an array without having to convert it to a Collection first. I'm willing to create my own class for such purpose, but I don't want to reinvent the wheel hehe.

So as a summary (in case you don't know exactly what I'm talking about), this is what I've found so far that you can do with Guava in order to apply a function to an array as I said:

Integer[] someNumbers = new Integer[]{1, 2, 3};
Integer[] returnedNumbers = Collections2.transform(Arrays.asList(someNumbers), squareNumberFunction).toArray(new Integer[0]);

assertThat(returnedNumbers).isEqualTo(new Integer[]{1, 4, 9});//Using AssertJ here

But I would like to be able to do something like this instead:

Integer[] someNumbers = new Integer[]{1, 2, 3};
Integer[] returnedNumbers = Arrays.transform(someNumbers, squareNumberFunction);

assertThat(returnedNumbers).isEqualTo(new Integer[]{1, 4, 9});

Ideally the functionality I'm talking about would be type-safe.


EDIT

For even further clarification of the problem:

  • The arrays I'm talking about are not primitive arrays, they reference complex objects (I only used integers to easily exemplify what I was talking about).
  • I have no control over the received or send structures, they are arrays (imagine a legacy code situation where that's possible if you think that helps you understand the problem better).
  • Efficiency is a must when transforming the arrays and accessing them.
Rodrigo Quesada
  • 1,460
  • 1
  • 14
  • 20
  • 4
    Generics and arrays don't go well together. Why are you using arrays in the first place instead of Lists? That's what you should do: prefer collections over arrays. – JB Nizet Nov 23 '14 at 10:35
  • Yeah, unfortunately the problem is that **I do need to use arrays** for this case, and also I need **efficiency** when transforming the arrays (can be of any type). – Rodrigo Quesada Nov 23 '14 at 10:48
  • 2
    You'll have better efficiency by using collections, because it allows avoiding copies. Guava is all about collections, and AFAIK, you won't find the method you want in Guava. But nothing forbids you to write a wrapper containing the code posted in your question. – JB Nizet Nov 23 '14 at 11:29
  • Actually, there is nothing more efficient than an array if you don't plan on "growing it" over time indefinitely and don't need to randomly access its elements by something else than the corresponding position index. Collections are just an abstraction that helps you easily manage groups of objects by implementing different data structures underneath and exposing different collection-oriented behavior through their API. – Rodrigo Quesada Nov 23 '14 at 21:11
  • 3
    You can create a list that is just a view over another list. You can't do that with an array. The only option is to make a copy. That's why collections can be faster. – JB Nizet Nov 23 '14 at 22:24
  • A view over another list? And how do you think that's implemented underneath? (In the case of the ArrayList collection that should be pretty obvious) That's what I'm talking about. :) – Rodrigo Quesada Nov 23 '14 at 22:47
  • 1
    by avoiding a copy. For example: `List listOfStrings = Lists.transform(listOfIntegers, i -> i.toString())` is a List created from a List, and doesn't create any copy. That's not possible with arrays. You can't create a String[] array that is a view of an Integer[] array. – JB Nizet Nov 24 '14 at 14:47
  • You might be right under certain circumstances (like when you don't need a copy of the array?), but the problem with that view you are talking about for my case is that **each time** the elements are **accessed**, the passed function is computed on the original element (not very efficient now that you take that into account, right?). – Rodrigo Quesada Nov 24 '14 at 20:46
  • @RodrigoQuesada: It all depends on how you need to use it. Views are very useful (and efficient!) when you only need to use the transformed collection once (or only a few times) and/or when your transformation function is very fast/cheap. If you're going to use the transformed collection a lot and don't need it to automatically reflect changes to the original, making a copy is a good idea. Of course, it's really easy to make a copy of a view using something like `ImmutableList.copyOf(view)`. – ColinD Nov 25 '14 at 19:00
  • Yeah, I agree with your statement, I use views and collections a lot, just not for this particular problem (as you can read in the post it has certain particular restrictions which would not be optimally addressed using views or conversions to collections). Views are indeed very useful and efficient in general (if used correctly as you mentioned), but that doesn't make them faster as JB implied (just generally more convenient to use on your code). – Rodrigo Quesada Nov 25 '14 at 20:16

2 Answers2

5

It's not true that arrays are considerably faster than collections - some collections are just wrappers over arrays so you loose nothing but gets the best of both worlds.

Coming back to your case I think you can use Arrays.asList (a list backed by your array) or ImmutableList (if you don't need growing array) whatever suits you but don't use Arrays. The very first sign is that you have to write unnecessary code (your answer and more in future) which you'll have to maintain/tune in future, in my experience it's not worth it.

Java is a higher level language and such tiny optimizations should be better left to Java for all the good reasons but even after all this if you still have doubts I would suggest you to write a micro benchmark to validate your assumption.

Community
  • 1
  • 1
Premraj
  • 7,802
  • 8
  • 45
  • 66
  • I never said that arrays are considerably faster than collections (maybe you wanna quote me?), I only said that nothing can be more efficient than an array given certain conditions. – Rodrigo Quesada Nov 24 '14 at 20:05
  • In the other hand I don't need **best of both worlds** for my problem, because my problem **lives** in **one world** only (I added further clarifications to my post, so you might want to check that). – Rodrigo Quesada Nov 24 '14 at 20:23
  • Also, I'm not planning on writing a micro benchmark because I'm pretty damn sure that **nothing** can beat iterating an array and accessing its elements sequentially (many, many times). – Rodrigo Quesada Nov 24 '14 at 20:32
  • 3
    @RodrigoQuesada, in native C I'm sure you are correct that nothing beats array access. In a managed language such as Java, and with a JIT, I'm not so sure. I'd recommend not being dismissive of micro benchmarks -- they are the perfect tool for communication of questions of this nature. – Kirk Woll Nov 25 '14 at 00:18
  • Well, I would be very surprised if a JIT were to translate a collection's bytecode into something more optimized than an array when generating the machine code. In any case though, I just solved my problem implementing a small generic method which is exactly what I need. – Rodrigo Quesada Nov 25 '14 at 00:41
  • 1
    @RodrigoQuesada, but I believe you are mistaken in thinking that an array in the JVM is really at all like an array in C. In the former, accessing the elements of the array *still* requires you to go through JVM bytecode -- you are *not* directly accessing the underlying memory space with a direct pointer. – Kirk Woll Nov 25 '14 at 20:56
  • I don't believe that, what I'm saying is that the collection's representation in bytecode and/or machine code cannot be faster than the corresponding array's representation (that should be pretty clear I think). – Rodrigo Quesada Nov 25 '14 at 21:07
  • Hmm, about your statement of "not directly accessing the underlying memory space", would you mind pointing me out to the article/document were you read that? Normally I would think that depends on the specific JIT compiler optimizations (such as HotSpot, which eliminates certain memory access checks for arrays under certain conditions). The thing is that a JVM is just that, a Java virtual machine, and therefor you should never assume a specific implementation of it (I mean, unless you are programming for a specific implementation, which is definitely not the norm). – Rodrigo Quesada Nov 25 '14 at 21:30
  • 1
    The underlying problem is that you are working with an array of objects: in Java, that means at best a series of _references_ stored in a single memory block. Those references could point anywhere else, meaning that any benefit you receive from quick array iteration is hampered by heap access. – Kenogu Labz Feb 03 '15 at 18:03
4

Okay, I ended up writing my own code in order to solve the aforementioned problem. So I hope someone else find it useful too, and don't just have to resort to using collections (come on, why would you have to?).

public static <A, B> B[] transform(Class<B> theReturnedValueType, Function<A, B> functionToApply, A... theValues) {

    B[] transformedValues = (B[]) Array.newInstance(theReturnedValueType, theValues.length);

    for (int i = 0; i < theValues.length; i++) {
        transformedValues[i] = functionToApply.apply(theValues[i]);
    }

    return transformedValues;
}

Example of how to use it (actually this is a test):

Integer[] returnedNumbers = ArrayTransformer.transform(Integer.class, squareNumberFunction, 1, 2, 3);

assertThat(returnedNumbers).isEqualTo(new Integer[]{1, 4, 9});

I want to point out that the example code is using vargargs syntax (it's a test, so it's a fun way to use it) but you can just obviously passed an array object instead.

Also, I must add that you need to be careful with this approach when using it on an environment sensitive to reflection performance hits (such as Android), because of the runtime creation of the new array using reflection (java.lang.reflect.newInstance(...)). In the other hand, the same holds true if you are using toArray(T[]) method of collections and the passed array's size isn't enough to hold all the collection elements (in which case a new array is allocated at run time, just as with my code), so maybe this code doesn't represent a problem if anyway you are using that method in the way I said (in any case you could easily change this code to fit your needs).

Rodrigo Quesada
  • 1,460
  • 1
  • 14
  • 20