I have to get the most common IP from a log file. I have a working method, but I need some help optimizing it: I have to replace some passages with streams. Can you help me, please?

  private static String mostCommonIp(String fileName) throws IOException
{
    Map<String, Long> ipAppearanceCount = new HashMap<>();
    String commonMostIp ="";
    List<String> lines = Files.lines(Paths.get(fileName))
            .collect(Collectors.toList());

    List<String> ipAddresses = lines
            .stream()
            .map(line -> line.replaceAll("[ ]+", " ").split(" ")[2])
            .collect(Collectors.toList());

    for (int i = 0; i < ipAddresses.size(); i++) {
        if (!ipAppearanceCount.containsKey(ipAddresses.get(i))){
            ipAppearanceCount.put(ipAddresses.get(i), (long) 1);
        }else{
            ipAppearanceCount.put(ipAddresses.get(i),
                    ipAppearanceCount.get(ipAddresses.get(i)) + 1);
        }
    }

    Comparator<? super Entry<String, Long>> maxValueComparator =
            Comparator.comparing(Entry::getValue);

    Optional<Entry<String, Long>> maxValue = ipAppearanceCount.entrySet()
            .stream().max(maxValueComparator);

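    // Return only the IP (the key), not the Optional's string form such as "Optional[ip=count]".
    return maxValue.map(Entry::getKey).orElse("");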
}

Here is the log file

  • So do it, what's the problem? – Meo Mar 17 '18 at 20:48
  • My teacher said I should use more streams but I can’t find the place where I can replace something with them –  Mar 17 '18 at 21:05
  • https://stackoverflow.com/questions/23276407/how-to-read-from-files-with-files-lines-foreach – Meo Mar 17 '18 at 21:44

1 Answer

Streams are usually not intended to be collected after each step. Instead, chain the next operation directly onto the stream.

For most operations a useful collector already exists, such as Collectors.groupingBy and Collectors.counting here. The final "one-liner" is this:

private static String mostCommonIp(String fileName) throws IOException {
    try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
        return stream
            // Get the IP as String. Your code. I hope this works for you.
            .map(line -> line.replaceAll("[ ]+", " ").split(" ")[2])
            // And now we group by the IPs, so we get a Map<String,Long>
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
            // Now we convert that back to a stream
            .entrySet().stream()
            // And find the entry with the highest number.
            .max(Comparator.comparing(Entry::getValue))
            // Get the actual IP out of it
            .map(Entry::getKey)
            // and return the IP with the highest count - or null if not found.
            .orElse(null);
    }
}
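
To try the pipeline above without the real log file, here is a minimal sketch. It assumes, as the original `split(" ")[2]` does, that the IP is the third space-separated field. The class name `MostCommonIpDemo`, the sample lines, and the `Stream<String>` overload are made up for illustration, and `Map.Entry.comparingByValue()` is swapped in as an equivalent to `Comparator.comparing(Entry::getValue)`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MostCommonIpDemo {

    // Same pipeline as above, but taking the lines as a Stream so it can be
    // exercised without a real log file.
    static String mostCommonIp(Stream<String> lines) {
        return lines
            // Assumption: the IP is the third space-separated field.
            .map(line -> line.replaceAll("[ ]+", " ").split(" ")[2])
            // Count how often each IP appears: Map<String, Long>.
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
            .entrySet().stream()
            // Pick the entry with the highest count.
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            // null if the input had no lines.
            .orElse(null);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample lines; the real log format may differ.
        Path log = Files.createTempFile("sample", ".log");
        Files.write(log, Arrays.asList(
            "2018-03-17 20:48:01 10.0.0.1 GET /index.html",
            "2018-03-17 20:48:02 10.0.0.2 GET /about.html",
            "2018-03-17 20:48:03 10.0.0.1 GET /contact.html"));

        // try-with-resources closes the file handle behind Files.lines.
        try (Stream<String> lines = Files.lines(log)) {
            System.out.println(mostCommonIp(lines)); // prints 10.0.0.1
        } finally {
            Files.deleteIfExists(log);
        }
    }
}
```

Note that a line with fewer than three fields would throw an ArrayIndexOutOfBoundsException, so a real log may need a filter before the map step.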
Johannes Kuhn
  • A small test with your log file suggests an 85% speed improvement when using `.parallel()` before the first map. – Johannes Kuhn Mar 20 '18 at 19:51
  • What is `Function.identity()`? –  Mar 20 '18 at 20:34
  • It returns the [identity function](https://en.wikipedia.org/wiki/Identity_function): basically a function that returns its argument without doing anything to it. See also the [JavaDoc](https://docs.oracle.com/javase/8/docs/api/java/util/function/Function.html#identity--) – Johannes Kuhn Mar 20 '18 at 20:45
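
To make the `Function.identity()` point concrete, here is a small sketch showing that it groups exactly like the lambda `ip -> ip`. The class name `IdentityDemo`, its two helper methods, and the sample values are made up for illustration.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IdentityDemo {

    // Groups each element under itself, using Function.identity() as the key function.
    static Map<String, Long> countWithIdentity(Stream<String> ips) {
        return ips.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // The same grouping with an explicit lambda that returns its argument.
    static Map<String, Long> countWithLambda(Stream<String> ips) {
        return ips.collect(Collectors.groupingBy(ip -> ip, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> a = countWithIdentity(Stream.of("10.0.0.1", "10.0.0.2", "10.0.0.1"));
        Map<String, Long> b = countWithLambda(Stream.of("10.0.0.1", "10.0.0.2", "10.0.0.1"));
        System.out.println(a.equals(b)); // prints true
    }
}
```

In `groupingBy`, the first argument decides which key each element is filed under. Since the stream already contains the IPs themselves, the key is simply the element, which is what the identity function expresses.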