18

I have several logs containing lines all starting with a timestamp, so that the following works as expected to merge them:

cat myLog1.txt myLog2.txt | sort -n > combined.txt

The problem is that myLog2.txt can also contain lines without a timestamp (e.g. Java stack traces). Is there an easy way, without any custom scripts, to still merge them and preserve the multiline content?

Example myLog1.txt

11:48:18.825 [main] INFO  org.hibernate.cfg.Environment - HHH000206: hibernate.properties not found
11:48:55.784 [main] INFO  o.h.tool.hbm2ddl.SchemaUpdate - HHH000396: Updating schema

Example myLog2.txt

11:48:35.377 [qtp1484319352-19] ERROR c.w.b.c.ControllerErrorHandler -
org.springframework.beans.TypeMismatchException: Failed to convert value of type   'java.lang.String' to required type 'org.joda.time.LocalDate'; nested exception is    org.springframework.core.convert.ConversionFailedException: Failed to convert from type     java.lang.String to type @org.springframework.web.bind.annotation.RequestParam   @org.springframework.format.annotation.DateTimeFormat org.joda.time.LocalDate for value    '[2013-03-26]'; nested exception is java.lang.IllegalArgumentException: Invalid format: "    [2013-03-26]"
    at org.springframework.beans.TypeConverterSupport.doConvert(TypeConverterSupport.java:68) ~[spring-beans-3.2.1.RELEASE.jar:3.2.1.RELEASE]
at org.springframework.beans.TypeConverterSupport.convertIfNecessary(TypeConverterSupport.java:45) ~[spring-beans-3.2.1.RELEASE.jar:3.2.1.RELEASE]
at org.springframework.validation.DataBinder.convertIfNecessary(DataBinder.java:595) ~[spring-context-3.2.1.RELEASE.jar:3.2.1.RELEASE]
at org.springframework.web.method.annotation.AbstractNamedValueMethodArgumentResolver.resolveArgument(AbstractNamedValueMethodArgumentResolver.java:98) ~[spring-web-3.2.1.RELEASE.jar:3.2.1.RELEASE]
at org.springframework.web.method.support.HandlerMethodArgumentResolverComposite.resolveArgument(HandlerMethodArgumentResolverComposite.java:77) ~[spring-web-3.2.1.RELEASE.jar:3.2.1.RELEASE]
at org.springframework.web.method.support.InvocableHandlerMethod.getMethodArgumentValues(InvocableHandlerMethod.java:162) ~[spring-web-3.2.1.RELEAS

Expected output

11:48:18.825 [main] INFO  org.hibernate.cfg.Environment - HHH000206: hibernate.properties not found
11:48:35.377 [qtp1484319352-19] ERROR c.w.b.c.ControllerErrorHandler -
org.springframework.beans.TypeMismatchException: Failed to convert value of type   'java.lang.String' to required type 'org.joda.time.LocalDate'; nested exception is    org.springframework.core.convert.ConversionFailedException: Failed to convert from type     java.lang.String to type @org.springframework.web.bind.annotation.RequestParam   @org.springframework.format.annotation.DateTimeFormat org.joda.time.LocalDate for value    '[2013-03-26]'; nested exception is java.lang.IllegalArgumentException: Invalid format: "    [2013-03-26]"
    at org.springframework.beans.TypeConverterSupport.doConvert(TypeConverterSupport.java:68) ~[spring-beans-3.2.1.RELEASE.jar:3.2.1.RELEASE]
at org.springframework.beans.TypeConverterSupport.convertIfNecessary(TypeConverterSupport.java:45) ~[spring-beans-3.2.1.RELEASE.jar:3.2.1.RELEASE]
at org.springframework.validation.DataBinder.convertIfNecessary(DataBinder.java:595) ~[spring-context-3.2.1.RELEASE.jar:3.2.1.RELEASE]
at org.springframework.web.method.annotation.AbstractNamedValueMethodArgumentResolver.resolveArgument(AbstractNamedValueMethodArgumentResolver.java:98) ~[spring-web-3.2.1.RELEASE.jar:3.2.1.RELEASE]
at org.springframework.web.method.support.HandlerMethodArgumentResolverComposite.resolveArgument(HandlerMethodArgumentResolverComposite.java:77) ~[spring-web-3.2.1.RELEASE.jar:3.2.1.RELEASE]
at org.springframework.web.method.support.InvocableHandlerMethod.getMethodArgumentValues(InvocableHandlerMethod.java:162) ~[spring-web-3.2.1.RELEAS
11:48:55.784 [main] INFO  o.h.tool.hbm2ddl.SchemaUpdate - HHH000396: Updating schema

Thanks Marco

Marco Behler

6 Answers

28

I was struggling with the same issue and I think I've finally got it. Try it like this:

sort -nbms -k1.1,1.2 -k1.4,1.5 -k1.7,1.8 -k1.10,1.12 myLog1.txt myLog2.txt > combined.txt

It's still not fully clear to me, but I'll try to give some explanation. According to the man page, the switches used mean:

-n, --numeric-sort - compare according to string numerical value.

-b, --ignore-leading-blanks - ignore leading blanks.

-s, --stable - stabilize sort by disabling last-resort comparison

-m, --merge - merge already sorted files; do not sort

-k, --key=POS1[,POS2] - start a key at POS1 (origin 1), end it at POS2 (default end of line)
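To see what -n and character-level -k positions do with this timestamp format, here is a minimal sketch. The two sample lines are made up, and the behaviour described in the comments assumes GNU sort in the C locale.

```shell
# Two entries within the same hour, deliberately out of order.
sample='11:48:55.784 later
11:48:18.825 earlier'

# Whole first field as one numeric key: -n reads "11" and stops at the colon,
# so both keys compare equal and -s leaves the input order untouched.
by_field=$(printf '%s\n' "$sample" | LC_ALL=C sort -s -n -k1,1)

# One numeric key per timestamp component (hour, minute, second, millis).
by_parts=$(printf '%s\n' "$sample" |
  LC_ALL=C sort -s -n -k1.1,1.2 -k1.4,1.5 -k1.7,1.8 -k1.10,1.12)

printf '%s\n---\n%s\n' "$by_field" "$by_parts"
```

The first result should keep "later" before "earlier", while the second should put them in time order.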

  • The log files are already ordered, so we don't need to sort them again, only to determine which line goes where when merging. That's why -m is used. It's crucial for keeping stack traces from getting scrambled.
  • -b is not necessary in this case, as -n and -m combined somehow keep the stack-trace lines from getting clustered. I left it in just in case, as most stack-trace lines start with blanks.
  • -n apparently stops reading a key at the first non-numeric character. That's the second crucial bit for keeping stack traces in place. Importantly, with -n -k1,1 the log files would be ordered by hour only, as the colon is non-numeric. Apart from that, -n speeds up numeric comparison, so we want it anyway.
  • The problem mentioned in the previous point is solved by pointing at specific character positions for each key; that's why -k1.1,1.2 (first and second digit of the hour), -k1.4,1.5 (first and second digit of the minutes) and so on. The number before the dot is always 1, as it points to the first column of the line (which in our case is the time). In short, it's -kA,B, where A and B are positions in a given line (by default columns are delimited by blanks). The format of A and B is F.C (field number, then character position within that field). Keep in mind that with -n, whenever there is a non-numeric character between A and B, everything after it is ignored in the comparison.
  • -s disables the default behaviour, which is to fall back to a full string comparison of the lines whenever the keys compare equal. We don't want that, in order to preserve the original order of log entries. Not sure whether it's necessary with -m, though.
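A small way to try the command out is sketched below. The log contents, the exception name, and the scratch directory are invented stand-ins for the real files; only the sort invocation itself is taken from above. GNU sort is assumed.

```shell
# Work in a scratch directory with two tiny stand-ins for the real logs.
dir=$(mktemp -d)
printf '%s\n' \
  '11:48:18.825 [main] INFO first' \
  '11:48:55.784 [main] INFO third' > "$dir/myLog1.txt"
printf '%s\n' \
  '11:48:35.377 [qtp] ERROR second' \
  'org.example.SomeException: boom' \
  '    at org.example.Foo.bar(Foo.java:1)' > "$dir/myLog2.txt"

# Merge (not sort) the two files on hour, minute, second and millisecond keys.
merged=$(LC_ALL=C sort -nbms -k1.1,1.2 -k1.4,1.5 -k1.7,1.8 -k1.10,1.12 \
  "$dir/myLog1.txt" "$dir/myLog2.txt")
printf '%s\n' "$merged"
rm -r "$dir"
```

On this sample the ERROR entry and its two trace lines should land between the two INFO lines. The trace lines have no digits at the key positions, so they compare as 0 and are emitted right after the timestamped line they follow. A trace line that happens to start with digits could still end up in the wrong place.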
topr
  • Well done @topr! If your log format is not a simple numerical sort, you do have to wrestle with the -k flag. I found it simpler to modify the timestamp to an easily sortable datetime format (such as yyyyMMddHHmmssSSS), but it _is_ harder to read. – yegeniy Nov 26 '14 at 16:30
  • Any idea on how to preserve stack traces here? Using ```sort -nbs -k1.1,1.4 -k1.6,1.7 -k1.9,1.10 test.log``` which sorts this bit of the log lines correctly ```2021-09-13``` but then if there are stack traces with no dates it throws them to the top of the output and not inline where they belong. – Andrew Sep 13 '21 at 23:50
1

Nope - can't be done with a simple command, IMHO.

But - here's a script to do it (it was a challenge...)

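@ECHO OFF
SETLOCAL
:: Copy the first log, in which every line is timestamped, to a tempfile
COPY /y myLog1.txt "%temp%\combinedlogs.tmp" >NUL
:: Append the second log, tagging each untimestamped line with a sort key
(
FOR /f "delims=" %%i IN (myLog2.txt) DO (
 SET line=%%i
 REM FINDSTR sets ERRORLEVEL 1 when the line does not start with a timestamp
 ECHO %%i|FINDSTR /b /r "[012][0-9]:[0-5][0-9]:[0-5][0-9]\.[0-9][0-9][0-9]" >NUL
 IF ERRORLEVEL 1 (
  REM Untimestamped: emit the saved 12-char stamp, the counter, then the text
  SETLOCAL ENABLEDELAYEDEXPANSION
  ECHO(!stamp:~0,12!!count!!line!
  ENDLOCAL
  SET /a count+=1
 ) ELSE (
  REM Timestamped: emit unchanged, save the stamp, reset the 3-digit counter
  SET /a count=100
  ECHO %%i
  SET stamp=%%i
 )
)
)>>"%temp%\combinedlogs.tmp"
:: Sort the tempfile, then strip the counter from the tagged lines
(
FOR /f "delims=" %%i IN ('SORT "%temp%\combinedlogs.tmp"') DO (
 SET line=%%i
 SETLOCAL enabledelayedexpansion
 REM A space in column 13 marks an original timestamped line
 IF "!line:~12,1!"==" " (ECHO(%%i) ELSE (ECHO(!line:~15!)
 ENDLOCAL
)
)>combinedlogs.txt
DEL "%temp%\combinedlogs.tmp" /F /Q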

Copy the first log, in which every entry is timestamped, to a tempfile.
Then process the second file by

  • outputting any timestamped line directly, saving its stamp and resetting a 3-digit counter
  • outputting stamp + counter + original text for every other line and bumping the counter

The tempfile thus contains

Timestamp1 line1 from file1
..
Timestampn linen from file1
timestampA line1 from file2 with timestamp
timestampA100 UNtimestamped line2 from file2
timestampA101 UNtimestamped line3 from file2
timestampB line4 from file2 with timestamp
timestampB100 UNtimestamped line5 from file2
timestampB101 UNtimestamped line6 from file2
...

Finally, sort the tempfile and reprocess the result.
A line with a non-space as its 13th character is an untimestamped line from the second file, so

  • output all but the first 15 chars (timestamp 12 chars + 3 for the counter)
  • otherwise it is a timestamped line, so output it whole.

Done!
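For anyone not on Windows, the same decorate/sort/undecorate idea can be sketched in a POSIX shell with awk. This is a variation, not a translation of the batch file: the function names decorate and merge_logs are made up, it decorates both files, and it uses the line number within the file instead of a 3-digit counter, so long stack traces and repeated timestamps do not collide.

```shell
# decorate FILE TAG: prefix every line with "<stamp> <tag> <line number>" and
# a TAB. Untimestamped lines inherit the stamp of the last timestamped line.
decorate() {
  awk -v tag="$2" '
    /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9][0-9][0-9]/ { stamp = substr($0, 1, 12) }
    { printf "%s %s %09d\t%s\n", stamp, tag, NR, $0 }
  ' "$1"
}

# merge_logs FILE1 FILE2: decorate both files, sort on the key before the
# first TAB only, then cut the key off again.
merge_logs() {
  { decorate "$1" 1; decorate "$2" 2; } |
    LC_ALL=C sort -s -t "$(printf '\t')" -k1,1 |
    cut -f2-
}
```

Usage would be along the lines of merge_logs myLog1.txt myLog2.txt > combined.txt. When both files contain the same timestamp, the entry from the first file comes first, together with its continuation lines.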

Magoo
1

Here's one way to do it in a bash shell with simple merging of the files (rather than expensive resorting - as log files are already sorted). This is important for huge files in the hundreds of megabytes or more, as often is the case with real world log files.

This solution assumes that there are no NUL bytes in your logs, which is true for every log file that I've come across, with various character sets.

The basic idea:

  1. Join each multi-line entry into a single line by replacing its inner newlines with NUL in each input file
  2. Run sort -m on the joined files to merge them
  3. Replace NUL with newlines again in the merged result

As the first step is done multiple times, I've given it an alias:

alias a="awk '{ if (match(\$0, /^[0-9]{2}:[0-9]{2}:[0-9]{2}\\./))\
{ if (NR == 1) printf \"%s\", \$0; else printf \"\\n%s\", \$0 }\
else printf \"\\0%s\", \$0 } END { print \"\" }'"

Here's the command that performs all 3 steps:

sort -m <(a myLog1.txt) <(a myLog2.txt) | tr '\0' '\n'

For more, see https://superuser.com/a/838446/125379
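Bash does not expand aliases in non-interactive scripts by default, so inside a script the same three steps could be sketched with functions instead. Note the swaps: this version uses temp files rather than process substitution, and it glues lines with the ASCII record-separator byte (octal 036) rather than NUL, on the assumption that the logs never contain that byte. Some awk implementations reportedly cannot print NUL. The names join_entries and merge_joined are made up.

```shell
# join_entries FILE: glue each untimestamped line onto the preceding
# timestamped line, using the RS control byte (octal 036) instead of newline.
join_entries() {
  awk '
    /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\./ {
      if (NR > 1) printf "\n"
      printf "%s", $0
      next
    }
    { printf "\036%s", $0 }
    END { if (NR > 0) printf "\n" }
  ' "$1"
}

# merge_joined FILE1 FILE2: join, merge without re-sorting, then split again.
merge_joined() {
  tmp=$(mktemp -d) || return
  join_entries "$1" > "$tmp/1"
  join_entries "$2" > "$tmp/2"
  LC_ALL=C sort -m "$tmp/1" "$tmp/2" | tr '\036' '\n'
  rm -r "$tmp"
}
```

It would be called as merge_joined myLog1.txt myLog2.txt > combined.txt.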

Evgeniy Berezovsky
1

The open source tool (Java GitHub) lets you combine log files with different formats including multilines into a merged file.

The tool can shift the time of records in a log file, which is useful when files come from different time zones.

It can also add information to the merged file, for example application names or timestamps in a uniform format; see the example.

The tool can be used as a command line tool or Java library. Note: I'm the author.

Kyrylo Semenko
1

Super Speedy Syslog Searcher can sort log messages by datetimestamp. If you can change the logging format to include the calendar date then it will work.

(assuming you have rust installed)

cargo install super_speedy_syslog_searcher

then

s4 myLog1.txt myLog2.txt > combined.txt
JamesThomasMoon
0

You should use the merge, stable, ignore-leading-blanks and numeric-sort options, together with an easily sortable datetime format (such as yyyyMMddHHmmssSSS) in your log files.

So, I changed your log format to be more easily sortable, resulting in sort -bsnm log1 log2:

 $ cat -n log1 log2 && sort -m -b -n -s log1 log2
      1 114818825 [main] INFO  org.hibernate.cfg.Environment - HHH000206 hibernate.properties not found
      2 114855784 [main] INFO  o.h.tool.hbm2ddl.SchemaUpdate - HHH000396 Updating schema
      1 114835377 [qtp1484319352-19] ERROR c.w.b.c.ControllerErrorHandler -
      2 org.springframework.beans.TypeMismatchException Failed to convert value of type   'java.lang.String' to required type 'org.joda.time.LocalDate'; nested exception is    org.springframework.core.convert.ConversionFailedException Failed to convert from type     java.lang.String to type @org.springframework.web.bind.annotation.RequestParam   @org.springframework.format.annotation.DateTimeFormat org.joda.time.LocalDate for value    '[2013-03-26]'; nested exception is java.lang.IllegalArgumentException Invalid format "    [2013-03-26]"
      3     at org.springframework.beans.TypeConverterSupport.doConvert(TypeConverterSupport.java68) ~[spring-beans-3.2.1.RELEASE.jar3.2.1.RELEASE]
      4 at org.springframework.beans.TypeConverterSupport.convertIfNecessary(TypeConverterSupport.java45) ~[spring-beans-3.2.1.RELEASE.jar3.2.1.RELEASE]
      5 at org.springframework.validation.DataBinder.convertIfNecessary(DataBinder.java595) ~[spring-context-3.2.1.RELEASE.jar3.2.1.RELEASE]
      6 at org.springframework.web.method.annotation.AbstractNamedValueMethodArgumentResolver.resolveArgument(AbstractNamedValueMethodArgumentResolver.java98) ~[spring-web-3.2.1.RELEASE.jar3.2.1.RELEASE]
      7 at org.springframework.web.method.support.HandlerMethodArgumentResolverComposite.resolveArgument(HandlerMethodArgumentResolverComposite.java77) ~[spring-web-3.2.1.RELEASE.jar3.2.1.RELEASE]
      8 at org.springframework.web.method.support.InvocableHandlerMethod.getMethodArgumentValues(InvocableHandlerMethod.java162) ~[spring-web-3.2.1.RELEAS
      9 
 114818825 [main] INFO  org.hibernate.cfg.Environment - HHH000206 hibernate.properties not found
 114835377 [qtp1484319352-19] ERROR c.w.b.c.ControllerErrorHandler -
 org.springframework.beans.TypeMismatchException Failed to convert value of type   'java.lang.String' to required type 'org.joda.time.LocalDate'; nested exception is    org.springframework.core.convert.ConversionFailedException Failed to convert from type     java.lang.String to type @org.springframework.web.bind.annotation.RequestParam   @org.springframework.format.annotation.DateTimeFormat org.joda.time.LocalDate for value    '[2013-03-26]'; nested exception is java.lang.IllegalArgumentException Invalid format "    [2013-03-26]"
     at org.springframework.beans.TypeConverterSupport.doConvert(TypeConverterSupport.java68) ~[spring-beans-3.2.1.RELEASE.jar3.2.1.RELEASE]
 at org.springframework.beans.TypeConverterSupport.convertIfNecessary(TypeConverterSupport.java45) ~[spring-beans-3.2.1.RELEASE.jar3.2.1.RELEASE]
 at org.springframework.validation.DataBinder.convertIfNecessary(DataBinder.java595) ~[spring-context-3.2.1.RELEASE.jar3.2.1.RELEASE]
 at org.springframework.web.method.annotation.AbstractNamedValueMethodArgumentResolver.resolveArgument(AbstractNamedValueMethodArgumentResolver.java98) ~[spring-web-3.2.1.RELEASE.jar3.2.1.RELEASE]
 at org.springframework.web.method.support.HandlerMethodArgumentResolverComposite.resolveArgument(HandlerMethodArgumentResolverComposite.java77) ~[spring-web-3.2.1.RELEASE.jar3.2.1.RELEASE]
 at org.springframework.web.method.support.InvocableHandlerMethod.getMethodArgumentValues(InvocableHandlerMethod.java162) ~[spring-web-3.2.1.RELEAS

 114855784 [main] INFO  o.h.tool.hbm2ddl.SchemaUpdate - HHH000396 Updating schema

As said in @Magoo's answer, the way your logs' datetime is currently formatted is hard to sort.
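If the logging format itself can't be changed, the existing files can be rewritten on the fly. Here is a rough sketch, assuming every stamp is a leading HH:MM:SS.mmm and that your sed supports -E (GNU and BSD sed do). The function names normalize and merge_normalized are made up. Unlike the listing above, it only touches the leading stamp and leaves other colons alone.

```shell
# normalize FILE: rewrite a leading HH:MM:SS.mmm stamp as HHMMSSmmm,
# leaving the rest of the line (and untimestamped lines) untouched.
normalize() {
  sed -E 's/^([0-9]{2}):([0-9]{2}):([0-9]{2})\.([0-9]{3})/\1\2\3\4/' "$1"
}

# merge_normalized FILE1 FILE2: normalize both logs, then merge them with
# the same switches as above.
merge_normalized() {
  tmp=$(mktemp -d) || return
  normalize "$1" > "$tmp/1"
  normalize "$2" > "$tmp/2"
  LC_ALL=C sort -m -b -n -s "$tmp/1" "$tmp/2"
  rm -r "$tmp"
}
```

It would be used as merge_normalized myLog1.txt myLog2.txt > combined.txt. Stack-trace lines that begin with a digit or a minus sign would be treated as numbers, so they could still end up in the wrong place.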

yegeniy
  • This will screw multi-line log entries by 'sorting' away stacktraces, won't it? – topr Nov 24 '14 at 19:47
  • Hi @topr, been a while, but from what I remember seeing, multi-line stacktraces are preserved with `sort -bsnm log1 log2...`. I wouldn't be surprised if there were corner cases, but for the example @marco provided and my own usage this worked surprisingly well. There is a variety of log viewer tools out there if you need something more sophisticated though. – yegeniy Nov 25 '14 at 04:21
  • Thanks for your answer. Stacktraces are preserved, but with those switches all the lines are preserved anyway. There is no sorting at all, just merging. I'm using a log viewing tool, but I need merging with sort because I have multiple log files and it's much more convenient to feed the tool a single file. – topr Nov 25 '14 at 17:41
  • Hi @topr, consider writing your own Stack Overflow question with an example of what you're looking for. At least for the example in this question it should work. Keep in mind that each line must start with an easily sortable numerical representation of the time (such as yyyyMMddHHmmssSSS). That is, log 1's lines start with `114818825` and `114855784`, while log 2 starts with `114835377`. The sorted version comes out to `114818825`, `114835377`, `114855784`, since 18825<35377<55784. – yegeniy Nov 25 '14 at 21:43
  • Not sure a new question wouldn't be a duplicate, as it would be the same as this case. My log file lines start with date and time, like: 2014-11-25 06:43:20,991 INFO... That's a sortable format, isn't it? However, there are also lines for log entries with stacktraces, exactly as in the example above. Not sure what -m is supposed to do, but it disables sorting altogether (no different from using cat). Without -m it sorts but clusters all lines of all stacktraces at the top, which makes such sorting useless. I don't get your answer, as it doesn't seem `sort` is able to sort a log file with multiline entries. – topr Nov 26 '14 at 11:54