I have a log file with the following layout:

2018-01-01 01:01:00.000 Text
Text
Text

2018-01-01 01:02:00.000 Text
Text

2018-01-01 01:02:10.000

Each log event starts with a timestamp and ends with an empty line. I would like to process the file with a stream or an iterator to get one String per event. What's an elegant way to do this? In Python I would iterate over the lines and yield an event each time I reach an empty line. I'm a bit lost on how to do this in Java.

tobidope
  • Stream and one String per log ? – azro Apr 07 '18 at 17:07
  • Or, one string per timestamp? – zlakad Apr 07 '18 at 17:08
  • @tobidope - I also found [a more general answer on SO](https://stackoverflow.com/a/39013901/2071828), which may or may not be useful to you. – Boris the Spider Apr 07 '18 at 18:06
  • Which version of java are you using? If 9+, the right tool for this is `Scanner.findAll` – fps Apr 07 '18 at 21:43
  • @FedericoPeraltaSchaffner possibly. But `Scanner` has always been comically slow, and `Scanner` with a regex has the potential to be unusably slow - you would need to be very careful with the `Pattern`. – Boris the Spider Apr 07 '18 at 23:43
  • @BoristheSpider Yes, maybe. We should measure before risking opinions. With `Scanner.findAll` the whole solution takes 3 lines, and that is if you use a `try-with-resources` block... Compare that with adapting a stream's `Iterator` and turning that back into a stream. This would be my last resort, really. Besides, streams have an infrastructure overhead that would make using a regex with `Scanner` a viable choice. Ultimately, it will all depend on how good and efficient the regex is... – fps Apr 08 '18 at 01:42
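
For reference, the `Scanner.findAll` idea from the comments might look roughly like the sketch below (Java 9+). The class name `ScannerEvents`, the method name and the regex are assumptions made for illustration, not code from the commenters; the pattern treats an event as a run of consecutive non-empty lines. Nothing here has been benchmarked, so the performance concerns raised above still apply.

```java
import java.util.List;
import java.util.Scanner;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper illustrating Scanner.findAll (Java 9+); all names are illustrative.
final class ScannerEvents {

    // Assumed pattern: one event is a run of consecutive non-empty lines.
    private static final Pattern EVENT =
            Pattern.compile("[^\\r\\n]+(?:\\R[^\\r\\n]+)*");

    // Reads the whole source and returns one String per event.
    static List<String> events(final Readable source) {
        try (Scanner scanner = new Scanner(source)) {
            return scanner.findAll(EVENT)
                    .map(MatchResult::group)
                    .collect(Collectors.toList());
        }
    }
}
```

Each returned String keeps the line separators of the input between its lines. Collecting to a `List` lets the `Scanner` be closed before returning; a lazily evaluated variant would have to leave closing to the caller.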

1 Answer

I would write it as an Iterator.

Get a Stream<String> and ask for its Iterator; then create a method that takes an Iterator<String> and returns an Iterator<LogEvent>.

Then turn that back into a Stream<LogEvent>.

Let's assume, for now, that LogEvent has something like this:

class LogEvent {

    static Builder builder() {
        return new Builder();
    }

    static class Builder {
        Builder appendLine(final String line) {
            //do stuff
            return this;
        }

        LogEvent build() {
            //validate?
            return new LogEvent();
        }
    }
}
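
One way the `//do stuff` and `//validate?` placeholders could be filled in is sketched below. Everything beyond the `builder()`, `appendLine` and `build` signatures is an assumption: the `timestamp()` and `text()` accessors, the choice to parse the leading timestamp with `DateTimeFormatter`, and the rule that an event must start with a 23-character timestamp are all illustrative.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical fleshed-out LogEvent; fields and validation rules are assumptions.
final class LogEvent {

    // Matches timestamps such as "2018-01-01 01:01:00.000".
    private static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSS");
    private static final int TIMESTAMP_LENGTH = 23;

    private final LocalDateTime timestamp;
    private final String text;

    private LogEvent(final LocalDateTime timestamp, final String text) {
        this.timestamp = timestamp;
        this.text = text;
    }

    LocalDateTime timestamp() {
        return timestamp;
    }

    // All lines of the event, joined with '\n'.
    String text() {
        return text;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private final List<String> lines = new ArrayList<>();

        Builder appendLine(final String line) {
            lines.add(line);
            return this;
        }

        LogEvent build() {
            // Assumed validation: the first line must begin with a timestamp.
            if (lines.isEmpty() || lines.get(0).length() < TIMESTAMP_LENGTH) {
                throw new IllegalStateException("event must start with a timestamp");
            }
            // Throws DateTimeParseException if the prefix is not a valid timestamp.
            final LocalDateTime parsed = LocalDateTime.parse(
                    lines.get(0).substring(0, TIMESTAMP_LENGTH), TIMESTAMP);
            return new LogEvent(parsed, String.join("\n", lines));
        }
    }
}
```

With this version, `build()` rejects an empty builder, so stray blank lines in the input would surface as an exception rather than as an empty event.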

Then something like this would work:

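// Requires java.util.Iterator and java.util.NoSuchElementException.
Iterator<LogEvent> toLogEvents(final Iterator<String> lineIterator) {
    return new Iterator<LogEvent>() {
        @Override
        public boolean hasNext() {
            return lineIterator.hasNext();
        }

        @Override
        public LogEvent next() {
            // Honour the Iterator contract once the lines are exhausted.
            if (!lineIterator.hasNext()) {
                throw new NoSuchElementException();
            }
            final LogEvent.Builder builder = LogEvent.builder();
            String line;
            // Consume lines up to, and including, the blank line that ends the event.
            while (lineIterator.hasNext() && !(line = lineIterator.next()).isEmpty()) {
                builder.appendLine(line);
            }
            return builder.build();
        }
    };
}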

Now you can write a method:

Stream<LogEvent> toLogEvents(final Supplier<Stream<String>> fileReader) {
    final Stream<String> lines = fileReader.get();
    final Iterator<LogEvent> logEventIterator = toLogEvents(lines.iterator());
    // Closing the returned stream also closes the underlying stream of lines.
    return StreamSupport.stream(
            Spliterators.spliteratorUnknownSize(logEventIterator, Spliterator.ORDERED), false)
            .onClose(lines::close);
}
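
Since the question asks for one String per event, here is a self-contained sketch of the same iterator-wrapping technique that yields Strings directly. The class name `EventStrings` and its `events` method are hypothetical. Unlike the version above, it reads ahead in `hasNext()`, which is an addition of this sketch so that repeated or trailing blank lines do not produce empty events.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.StringJoiner;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper; the class and method names are illustrative only.
final class EventStrings {

    // Groups consecutive non-empty lines into one String per event.
    static Stream<String> events(final Stream<String> lines) {
        final Iterator<String> lineIterator = lines.iterator();
        final Iterator<String> eventIterator = new Iterator<String>() {
            private String buffered; // event read ahead by hasNext(), or null

            @Override
            public boolean hasNext() {
                if (buffered == null) {
                    buffered = readEvent();
                }
                return buffered != null;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                final String event = buffered;
                buffered = null;
                return event;
            }

            // Reads lines up to the next blank line; returns null at end of input.
            private String readEvent() {
                final StringJoiner joiner = new StringJoiner("\n");
                boolean sawLine = false;
                while (lineIterator.hasNext()) {
                    final String line = lineIterator.next();
                    if (line.isEmpty()) {
                        if (sawLine) {
                            break;    // a blank line ends the current event
                        }
                        continue;     // skip extra blank lines between events
                    }
                    joiner.add(line);
                    sawLine = true;
                }
                return sawLine ? joiner.toString() : null;
            }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(
                        eventIterator, Spliterator.ORDERED | Spliterator.NONNULL),
                false)
                .onClose(lines::close);
    }
}
```

A typical call would wrap `Files.lines(path)` in a try-with-resources block, for example `try (Stream<String> events = EventStrings.events(Files.lines(path))) { ... }`, so that closing the event stream also closes the file.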
Boris the Spider