
So I was trying to optimize my code as much as possible. The following code used to run in about 5 seconds; I managed to reduce that to about 1.4 seconds, but it's still not fast enough. What can I do to optimize this code even more? (Maybe I should mention that the times above are measured when the aux map ends up with 170080 keys.)

    public List<String> getProdutosMaisCompradosQuantidade(int filial, int X) {
        Map<String, ProdutoFilial> aux;

        if (filial == 0) {
            aux = new HashMap<>(ValoresFixos.CATALOGO_PRODUTOS_TAMANHO_INICIAL);
            filiais.forEach(f -> {
                Map<String, ProdutoFilial> aux2 = f.getMapProdutosDadosFilialSemEncapsulamento();
                aux2.forEach((k, t) -> {
                    if (t.getQuantidade() > 0) {
                        if (aux.containsKey(k)) aux.get(k).atualizarValores(t);
                        else aux.put(k, t);
                    }
                });
            });
        } else {
            aux = filiais.get(filial - 1).getMapProdutosDadosFilialSemEncapsulamento();
        }

        List<String> list =
            aux.entrySet()
               .stream()
               .sorted(new ComparadorProdutoQuantidade())
               .map(e -> e.getKey() + "\n     |   o Quantidade: " + e.getValue().getQuantidade()
                       + "; Comprado por " + e.getValue().getNumeroCompradores() + " Clientes Distintos")
               .collect(Collectors.toList());

        if (X > list.size()) X = list.size();
        list.subList(X, list.size()).clear();

        return list;
    }

All the methods I use here are close to O(1), and the comparator isn't too taxing either, so that shouldn't be the problem. Is there something I may not know that could help me optimize these stream operations? Maybe the entrySet call can be avoided? It's probably the most expensive operation here...

EDIT1: Maybe I should explain the idea behind this method. Its main purpose is to sort the map aux and return a list of the keys in that order (the keys are also reformatted along the way, but that's not the main intention).

sharp_c-tudent
  • You are talking about "ordering" keys, but `aux` is a `HashMap`, and those do not have a defined order. Have you thought about parallelizing everything, e.g. using a `parallelStream()` instead of a `stream()`? If you do, don't forget to synchronize access to `aux`. – Robert Jun 08 '16 at 18:20
  • O(1) only means it doesn't scale with input size; it can still take a constant year, for example. And when you're optimizing a few seconds, these constant-time differences make a hell of a difference (and sometimes O(something larger than 1) is even faster in practice - typically arrays vs linked object structures). PS: it would be very helpful if you translated your code; understanding how valores and produtos relate to each other makes it a lot easier to understand your algorithm and what it's about. – zapl Jun 08 '16 at 19:08

1 Answer


You could try to replace the forEach statements with map and collect, as described in Java 8 - Best way to transform a list: map or foreach?

Then you could try if using a parallel stream improves your performance.

It is probably possible to replace the statement which creates the aux map with the Stream API (using Collectors.toMap and/or Collectors.groupingBy). This is considered cleaner than using forEach, which relies on stateful side effects.
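As an illustration of that idea, here is a minimal sketch using `Collectors.toMap` with a merge function. Since the real `ProdutoFilial` class isn't shown, it assumes a simplified value type (plain integer quantities standing in for the product data):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ToMapSketch {
    // Merge several per-branch maps of product -> quantity into one map,
    // summing quantities, without mutating any shared state.
    static Map<String, Integer> merge(List<Map<String, Integer>> branches) {
        return branches.stream()
                .flatMap(m -> m.entrySet().stream())
                .filter(e -> e.getValue() > 0)           // keep only products actually bought
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        Integer::sum));                  // merge function for duplicate keys
    }

    public static void main(String[] args) {
        Map<String, Integer> f1 = Map.of("A", 2, "B", 0);
        Map<String, Integer> f2 = Map.of("A", 3, "C", 1);
        System.out.println(merge(List.of(f1, f2)));      // e.g. {A=5, C=1}
    }
}
```

In the real code, the merge function would call something like `atualizarValores` instead of `Integer::sum`; note that mutating shared value objects inside a collector would reintroduce the statefulness this approach is meant to avoid.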

There are already quite a few questions about how to do that: https://stackoverflow.com/search?q=groupingBy+[java-stream] or https://stackoverflow.com/search?q=toMap+[java-stream]

If you need a quicker solution (with fewer changes), you could replace the Map with a ConcurrentHashMap and use a parallel stream. Its merge method is atomic, which makes the accumulation safe to run in parallel.
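A minimal sketch of that combination, again assuming simplified integer quantities in place of the real value type:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentMergeSketch {
    // Accumulate quantities from several branch maps in parallel.
    // ConcurrentHashMap.merge is atomic, so no external synchronization is needed.
    static Map<String, Integer> merge(List<Map<String, Integer>> branches) {
        ConcurrentHashMap<String, Integer> aux = new ConcurrentHashMap<>();
        branches.parallelStream()
                .forEach(branch -> branch.forEach((product, qty) -> {
                    if (qty > 0) {
                        aux.merge(product, qty, Integer::sum);  // atomic insert-or-update
                    }
                }));
        return aux;
    }

    public static void main(String[] args) {
        Map<String, Integer> f1 = Map.of("A", 2, "B", 0);
        Map<String, Integer> f2 = Map.of("A", 3, "C", 1);
        System.out.println(merge(List.of(f1, f2)));
    }
}
```

Whether the parallel version actually wins depends on the number of branches and map sizes; with only a handful of branches, the per-element work inside each branch map is where the parallelism would need to happen.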

user140547
  • I tried to use Collectors instead of forEach but I couldn't find a way to do it the way I intended, so I tried using a parallelStream instead of a stream and it was just amazing; I used it on the rest of my code and it made everything more than a couple of seconds faster like it was nothing. I'm really surprised and amazed. However, I want to make sure I don't mess up: I use ConcurrentHashMaps when I can and use the keyword synchronized on methods that really can't be called concurrently, but is there anything else I should be cautious about? – sharp_c-tudent Jun 09 '16 at 18:43
  • And also, is there any danger in using parallel streams with ArrayLists or HashSets, as long as I only read the data and don't modify it? – sharp_c-tudent Jun 09 '16 at 18:45
  • @sharp_c-tudent: When you use a `ConcurrentHashMap`, it should be ok if you use atomic operations like merge. As for the limitations of parallel streams, see http://stackoverflow.com/questions/21163108/custom-thread-pool-in-java-8-parallel-stream – user140547 Jun 09 '16 at 18:57
  • @sharp_c-tudent: Actually, to put it more clearly, you should prefer atomic operations like `putIfAbsent` or `merge` and not use `synchronized` with a `ConcurrentHashMap`. – user140547 Jun 09 '16 at 19:09
  • @sharp_c-tudent: If you only read from collections concurrently without modifying them, it should be no problem. – user140547 Jun 09 '16 at 19:39
  • Thanks! And for the forEach situation, I found that the compute method on the maps could be used instead, which also helped a lot. – sharp_c-tudent Jun 10 '16 at 12:34
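The `compute`-based accumulation mentioned in the last comment could look something like this (a sketch with simplified integer quantities, not the asker's actual types):

```java
import java.util.HashMap;
import java.util.Map;

public class ComputeSketch {
    // compute runs one function per key: insert when absent, update when present,
    // replacing the containsKey/get/put sequence with a single map operation.
    static Map<String, Integer> addAll(Map<String, Integer> totals,
                                       Map<String, Integer> incoming) {
        incoming.forEach((product, qty) ->
                totals.compute(product, (k, old) -> old == null ? qty : old + qty));
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Integer> totals = new HashMap<>(Map.of("A", 2));
        System.out.println(addAll(totals, Map.of("A", 3, "B", 1)));
    }
}
```

Note that `compute` on a plain `HashMap` is only safe single-threaded; for a parallel stream it would have to be `ConcurrentHashMap.compute` or `merge` as discussed above.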