1

I want to optimize the code below. Will streams optimize these nested for-each loops? If so, I am new to streams, so could someone please help me? I have replaced the names below for project confidentiality. I will be using `tList` for further processing in the code.

List<Tea> teas = requestBody.getTea();

for (Tea tea1 : teas) {
    List<String> teaValues = tea1.getTeaValues();
    for (String t : teaValues) {
        if (t.contains("tMapping") || t.contains("tdata")) {
            // Take the text between the first '.' and the first space.
            int subStrng = t.indexOf(".") + 1;
            int subStrngSpace = t.indexOf(" ");
            String tStrng = t.substring(subStrng, subStrngSpace);
            tList.add(tStrng);
        } else {
            // Take the text after the first single quote.
            String[] tStrng = t.split("\'");
            String t1 = tStrng[1];
            tList.add(t1);
        }
    }
}
Aspirer
  • If this code works, you should ask at [codereview.se] instead. With that said, I'm not sure there is a lot that can be improved here, beyond minor tweaks that would only have a significant effect if this code is executed a lot. – Andy Turner Nov 12 '21 at 11:04
  • 1
    Why do you have a third for inside your if/else, that seems extraneous since you could just do tList.add(t.split("\'")[1]) and it should come out to the same result? Unless t1 isn't actually teaValues. In general, how much data are you processing here? It might just be that a stream isn't exactly necessary (One of the big advantages of Streams is parallel processing for instance) Edit: What andy said – Eskir Nov 12 '21 at 11:04
  • 2
    @AndyTurner the question needs work before it's suited to [codereview.se]. You should have pointed the asker at [A guide to Code Review for Stack Overflow users](//codereview.meta.stackexchange.com/a/5778), as some things are done differently over there - e.g. we need a good description of the *purpose* of the code to give context, and question titles should simply say what the code *does* (the question is always, "_How can I improve this?_"). It's important that the code works correctly; include the unit tests if possible. – Toby Speight Nov 12 '21 at 11:36
  • 1
    What is the purpose of reiterating `teaValues` in the `else` part for all non-matched entries, and adding the same entries? If this is _really_ needed, the list should be prepared once and added to `tList.addAll(preparedList)` – Nowhere Man Nov 12 '21 at 12:05
  • @Eskir and Alex Rudenko: Edited the code as suggested. Thank you. – Aspirer Nov 12 '21 at 12:55
  • 1
    Since you expect `subStrngSpace` to be larger than `subStrng`, you can use `int subStrngSpace = t.indexOf(" ", subStrng);` to start the search at this point instead of the beginning of the string. Further, it seems the `else` branch is basically doing the same thing, extracting the substring between two delimiters, just using different delimiters. So it doesn’t have to resort to `split` (which creates an arbitrarily large array of unneeded substrings). Use the same `indexOf` based extraction for both cases. You can even use the same code when you use the conditional to select the appropriate delimiters. – Holger Nov 12 '21 at 15:18
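The `indexOf` based extraction suggested in the last comment could be sketched roughly as follows. The class name `TeaValueExtractor`, the method name `extract`, and the sample strings are hypothetical; the sketch also assumes, like the original loop, that every value contains both of its delimiters.

```java
class TeaValueExtractor {
    // Picks the delimiters with a conditional, then uses one indexOf-based
    // extraction for both cases (no split, no temporary array).
    static String extract(String t) {
        boolean dotted = t.contains("tMapping") || t.contains("tdata");
        char open = dotted ? '.' : '\'';
        char close = dotted ? ' ' : '\'';
        int start = t.indexOf(open) + 1;
        // Search for the closing delimiter from 'start', not from the beginning.
        int end = t.indexOf(close, start);
        // Assumption: the closing delimiter exists; otherwise substring throws.
        return t.substring(start, end);
    }

    public static void main(String[] args) {
        // Hypothetical sample values; the real formats are not shown in the question.
        System.out.println(extract("tMapping.name = 1")); // prints name
        System.out.println(extract("label = 'green'"));   // prints green
    }
}
```

One difference from the original `else` branch: `split("\'")[1]` also returns a result when the closing quote is missing, whereas this sketch throws in that case.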

1 Answer

3

Will streams optimize the below nested foreach loops?

No, not really, unless you have a really big input and you are running on a machine with multiple processors/cores. In that case you can get a speedup from using parallel streams.

Otherwise, these streams will just be executed as loops "under the hood", so you should not expect them to be faster than your loops.

That answers the question as asked.

Now let's look at the question behind the question: why do you want to optimise it? How will you know that your optimisation level is good enough?
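For reference, since the question asks what a stream version would look like, here is a minimal sketch of the same nested loops as a sequential stream. The nested `Tea` class is a hypothetical stand-in (only `getTeaValues()` comes from the question), and the class name, method names, and sample strings are assumptions as well.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class TeaStreamSketch {
    // Hypothetical stand-in for the asker's Tea type.
    static class Tea {
        private final List<String> teaValues;
        Tea(List<String> teaValues) { this.teaValues = teaValues; }
        List<String> getTeaValues() { return teaValues; }
    }

    // Same per-value logic as the loop body in the question.
    static String extract(String t) {
        if (t.contains("tMapping") || t.contains("tdata")) {
            return t.substring(t.indexOf(".") + 1, t.indexOf(" "));
        }
        return t.split("'")[1];
    }

    // flatMap replaces the inner loop, map applies the extraction,
    // and collect builds the result list instead of calling tList.add.
    static List<String> collect(List<Tea> teas) {
        return teas.stream()
                .flatMap(tea -> tea.getTeaValues().stream())
                .map(TeaStreamSketch::extract)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Tea> teas = Arrays.asList(
                new Tea(Arrays.asList("tMapping.name = 1", "label = 'green'")),
                new Tea(Arrays.asList("tdata.kind = 2")));
        System.out.println(collect(teas)); // prints [name, green, kind]
    }
}
```

This does the same work as the loops, so it is mainly a readability choice rather than an optimisation.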

EDIT:

  1. If this has to handle very large inputs, you need to make use of concurrency. The easiest way to do that would be to use parallel streams.
  2. Even better would be to move this processing from the application into the database, which should be able to process it even faster. Just write a native query (see here) that does all the data transformations within the database.
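A rough sketch of the parallel-stream option from point 1, under some assumptions: each inner list stands in for one `Tea`'s `getTeaValues()` result, and the class name, method names, and generated values are hypothetical. The important detail is that the results are gathered with `collect` rather than by calling `add` on a shared `ArrayList`, which is not safe to do from several threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ParallelTeaSketch {
    // Same per-value logic as the loop body in the question.
    static String extract(String t) {
        if (t.contains("tMapping") || t.contains("tdata")) {
            return t.substring(t.indexOf(".") + 1, t.indexOf(" "));
        }
        return t.split("'")[1];
    }

    // Each inner list plays the role of one Tea's values (an assumption
    // made to keep the sketch short).
    static List<String> extractAll(List<List<String>> teaValueLists, boolean parallel) {
        Stream<List<String>> source =
                parallel ? teaValueLists.parallelStream() : teaValueLists.stream();
        return source
                .flatMap(List::stream)
                .map(ParallelTeaSketch::extract)
                .collect(Collectors.toList()); // keeps encounter order, even in parallel
    }

    public static void main(String[] args) {
        List<List<String>> input = new ArrayList<>();
        for (int i = 0; i < 1_000; i++) {
            input.add(Arrays.asList("tMapping.field" + i + " = 1", "label = 'value" + i + "'"));
        }
        List<String> sequential = extractAll(input, false);
        List<String> parallel = extractAll(input, true);
        System.out.println(sequential.equals(parallel)); // prints true
    }
}
```

Whether the parallel variant is actually faster depends on the input size and the machine, so it is worth measuring before adopting it.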
Arthur Klezovich
  • Yes, the input might be huge. I'm not that sure about optimizing; I was asked to optimize this to take huge inputs. – Aspirer Nov 12 '21 at 11:19
  • How huge can it be? – Arthur Klezovich Nov 12 '21 at 11:21
  • With tea and teaValues being multiple. – Aspirer Nov 12 '21 at 11:22
  • OK, so first thing is ... you need to be able to measure this stuff ... you can start by covering the code with a performance test like this maybe https://www.codeproject.com/Articles/1251046/Doing-Performance-Testing-Easily-using-JUnit-and-M – Arthur Klezovich Nov 12 '21 at 11:22
  • "With Tea and teaValues being multiple." Are we talking about tens of records, hundreds, millions? – Arthur Klezovich Nov 12 '21 at 11:23
  • Or maybe billions? – Arthur Klezovich Nov 12 '21 at 11:23
  • If you are going to have a super big input, you need to use parallel streams for sure. – Arthur Klezovich Nov 12 '21 at 11:23
  • Maybe, but do not just throw `parallel` into stream pipelines without thinking further. This will yield a linear speedup at best (proportional to the number of available cores), and it might cause problems elsewhere. See the docs for package [`java.util.stream`](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/stream/package-summary.html) for details. The default implementation uses the common [ForkJoinPool](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/concurrent/ForkJoinPool.html), which might cause contention with other parts of the application. – Hulk Nov 12 '21 at 12:49
  • See https://stackoverflow.com/questions/30802463/how-many-threads-are-spawned-in-parallelstream-in-java-8 – Hulk Nov 12 '21 at 12:51
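To follow up on the advice in the comments to measure first, here is a deliberately naive timing sketch using only `System.nanoTime`. Everything in it (class name, helper names, the generated input, the warm-up and run counts) is hypothetical, and hand-rolled timing like this is only a rough indication; a dedicated benchmark harness would give more trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class NaiveTimingSketch {
    // Runs the task a few times untimed to let the JIT warm up, then
    // returns the fastest of the timed runs in nanoseconds.
    static long bestOfNanos(Supplier<?> task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.get();
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.get();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    // Hypothetical input: n values in the quoted form handled by the else branch.
    static List<String> sampleValues(int n) {
        List<String> values = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            values.add("label = 'v" + i + "'");
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> values = sampleValues(200_000);
        long sequential = bestOfNanos(
                () -> values.stream().map(v -> v.split("'")[1]).collect(Collectors.toList()), 5, 10);
        long parallel = bestOfNanos(
                () -> values.parallelStream().map(v -> v.split("'")[1]).collect(Collectors.toList()), 5, 10);
        // The numbers vary from run to run and machine to machine.
        System.out.printf("sequential: %d us, parallel: %d us%n",
                sequential / 1_000, parallel / 1_000);
    }
}
```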