I have some log files that contain a mix of JSON and non-JSON logs. I'd like to separate them into two files: one containing only the JSON logs, the other containing only the non-JSON logs. I got some ideas from this for extracting the JSON logs with jq. Here is what I have tried, using tee to split the log into two files (usage from here & here) and jq to extract the logs:

cat $logfile | tee >(jq -R -c 'fromjson? | select(type == "object") | not' > $plain_log_file) >(jq -R -c 'fromjson? | select(type == "object")' > $json_log_file)

This extracts the JSON logs correctly, but outputs false for each non-JSON log instead of the log content itself.

cat $logfile | tee >(jq -R -c 'try fromjson catch .' > $plain_log_file) >(jq -R -c 'fromjson? | select(type == "object")' > $json_log_file)

This fails with a jq syntax error at `catch .`.

I do this so I can view the logs in lnav (an excellent log viewing/navigation tool).

Any suggestions on how to achieve this? I appreciate your help!

sample input:

{ "name": "joe"}
text line, this can be multi-line too
{ "xyz": 123 }
tonywl
  • That's a [useless `cat`](https://stackoverflow.com/questions/11710552/useless-use-of-cat) – tripleee Oct 27 '22 at 05:09
  • Do you really need a json parser for this? Wouldn't `grep '^{'` suffice to extract JSON lines? – oguz ismail Oct 27 '22 at 05:16
  • @oguzismail I already tried `grep '^{'` but it does not handle some of the multiline non-JSON logs. – tonywl Oct 27 '22 at 05:20
  • @tripleee my script can get input from stdin or a file, in the former case `$logfile = "-"`. – tonywl Oct 27 '22 at 05:23
  • Then either explain that in the question itself, or remove this distraction. The lack of quoting is disturbing, too. – tripleee Oct 27 '22 at 05:26
  • Okay, consider including an excerpt from your log file then – oguz ismail Oct 27 '22 at 05:27
  • Added sample input. – tonywl Oct 27 '22 at 05:30
  • I'm afraid this won't be possible if there are multiline JSON strings. There's no way to detect them with jq, because with `-R` each line is parsed as a single string. Without `-R`, you have invalid JSON entities (the freeform text between JSON objects). Can you provide sample input with multiline log entries? With the input currently given, the solution is a trivial `grep '^{'`. Please make sure the sample accurately reflects the problem. – knittl Oct 27 '22 at 08:34

3 Answers


Assuming each JSON log item occurs on a separate line:

For the JSON logs:

jq -nR -c 'inputs|fromjson?'

For the others, you could use:

jq -nRr 'inputs | . as $in | try (fromjson|empty) catch $in'
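
To produce both files in one pass, these two filters can be dropped into the tee pattern from the question; a minimal sketch, assuming bash and the same $logfile, $json_log_file and $plain_log_file variables:

# Parsed JSON entities go to one file; lines that fail to parse
# are passed through verbatim to the other.
tee < "$logfile" \
    >(jq -nR -c 'inputs | fromjson?' > "$json_log_file") \
    >(jq -nRr 'inputs | . as $in | try (fromjson|empty) catch $in' > "$plain_log_file") \
    > /dev/null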
peak

If you only want to separate the input line-wise into different files, go with @peak's solution. But if you want to further process the lines based on conditions, you could turn them into an array using -Rn and [inputs], and go from there. For instance, if you need the corresponding line numbers (e.g. to feed them into another tool such as sed; a sketch follows after the demo below), use to_entries, which for arrays provides the indices in the .key field:

jq -Rn 'reduce ([inputs] | to_entries[]) as $in ({};
  .[($in.value | fromjson? | "json") // "plain"] += [$in.key]
)'
{
  "json": [
    0,
    2
  ],
  "plain": [
    1
  ]
}

Demo
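
If the goal is indeed to feed those line numbers into sed, the index list can be turned into a sed address script; a minimal sketch, assuming bash and the question's $logfile and $json_log_file variables (note that jq's indices are 0-based while sed's line addresses are 1-based):

# Build a sed script such as "1p;3p" from the 0-based indices of the
# lines that parse as JSON, then print exactly those lines.
json_lines=$(jq -Rrn '[inputs] | to_entries
  | map(select(.value | fromjson? | true) | "\(.key + 1)p")
  | join(";")' "$logfile")
sed -n "$json_lines" "$logfile" > "$json_log_file"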

pmf

If each JSON log entry can be spread over multiple lines, then some assumptions must be made about the non-JSON log entries. Here is an example based on reasonable assumptions about those entries; a bash or bash-like environment is also assumed for convenience.

function log {
    cat <<EOF
{ "name": 
 "joe"}
text line, this can be 
multi-line too
{ 
"xyz": 123 }
EOF
}

log | sed '/^[^"{[ ]/ { s/"/\\"/g ; s/^/"/; s/$/"/;}' |
    tee >(jq -rc 'select(type == "string")' > strings.log) |
    jq -rc 'select(type != "string")' > json.log
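
For the sample entries above, the two files should end up containing something like the following (the free-form lines become raw strings in strings.log, while the reassembled entities land compactly in json.log):

strings.log:
text line, this can be
multi-line too

json.log:
{"name":"joe"}
{"xyz":123}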
peak
  • I see you turned the plain text into strings which `jq` can handle, very good idea! Yes, it's tricky to cover everything plain text can include, but what you have suggested here is good for many cases. Thanks! – tonywl Oct 28 '22 at 04:41