30

I'm using jq to parse some of my logs, but some of the log lines can't be parsed for various reasons. Is there a way to have jq ignore those lines? I can't seem to find a solution. I tried to use the --seq argument that was recommended by some people, but --seq ignores all the lines in my file.

Brandon
  • The question is not about the individual lines which can't be parsed; it's about ignoring or bypassing the lines with issues. Imagine having a line that is just "123". – Brandon Jan 11 '17 at 19:43

5 Answers

55

Assuming that each log entry is exactly one line, you can use the -R or --raw-input option to tell jq to leave the lines unparsed, after which you can prepend fromjson? | to your filter to make jq try to parse each line as JSON and throw away the ones that error.
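For example, a minimal sketch of this approach (the file name and the `level` field are made up for illustration):

```shell
# Hypothetical mixed log: two JSON lines and one garbage line.
printf '{"level":"info"}\nnot json at all\n{"level":"error"}\n' > app.log

# -R hands each line to the filter as a raw string; fromjson? parses it
# as JSON and silently discards any line that fails to parse.
jq -c -R 'fromjson? | .level' app.log
# prints "info" then "error"; the garbage line is skipped
```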

jwodder
  • Oh, that's interesting. I'll give that a shot. Thanks! – Brandon Jan 11 '17 at 19:47
  • Worked! Thanks a lot! – Brandon Jan 11 '17 at 20:03
  • It took me a while to understand since I don't know much jq, but this was just what I was looking for to remove some bad records from a file. For others like me, a way to use it would be: ```cat file_to_clean.jsonl | jq -R "fromjson? | . " -c > clean_file.jsonl``` – steven2308 Apr 14 '20 at 09:11
  • If you want to convert unparsable lines to strings (like I needed to), you can prepend your script with `(. as $line | try fromjson catch $line) |`. – Bill Burdick Aug 09 '21 at 13:21
  • I found `tail -f log | jq -R 'try fromjson catch .'` enough for most cases – Dennis C Apr 04 '22 at 08:49
  • I'm using this for `kubectl` tailing logs for those pods spitting out JSON on std-out. It actually prints the fields that I need, while omitting others. Adapt to your needs :-) `kubectl logs -f --tail 200 | jq -R 'fromjson? | . | "\(.["@timestamp"]) -- \(.level) -- \(.logger_name) -- \(.message) \(.stack_trace)"' -r` – muelleth Apr 03 '23 at 20:08
  • I wanted to run a script on the JSON lines and echo the rest as-is, this was what did it for me: `jq --raw-input '. as $line | try (fromjson | MY-JQ-SCRIPT) catch $line'` – afarah May 17 '23 at 13:08
12

I have log stream where some messages are in json format. I want to pipe the json messages through jq, and just echo the rest.

The json messages are on a single line.

Solution: use grep and tee to split the lines in two streams, those starting with "^{" pipe through jq and the rest just echo to terminal.

kubectl logs -f web-svjkn | tee >(grep -v "^{") | grep "^{" | jq .

or

cat logs | tee >(grep -v "^{") | grep "^{" | jq .

Explanation: tee generates 2nd stream, and grep -v prints non json info, 2nd grep only pipes what looks like json opening bracket to jq.

Pieter
  • This only checks for lines starting with `{`; that doesn't mean they are valid JSON. How about this line: `{{`? – Steven Roose Nov 02 '20 at 12:35
  • @StevenRoose yes, and even `{` followed by non-JSON. But for log parsing you typically have lines that are JSON, or normal log lines that start with a date rather than a curly bracket, so filtering on the first character is enough to split the stream and pass only the JSON lines through jq. – Pieter Nov 08 '20 at 07:36
9

This is an old thread, but here's another solution, fully in jq, that lets you both process proper JSON lines and print out non-JSON lines:

jq -R '. as $line | try (fromjson) catch $line'

If you need to do additional jq processing:

jq -R '. as $line | try (fromjson | <further processing for proper json lines>) catch $line'
admdrew
Cristi
  • This was great. Adapted it a bit so it works fully after a simple copy/paste: `jq -R '. as $line | try (fromjson) catch $line'` – 203 Jan 28 '22 at 14:01
1

There are several Q&As on the FAQ page dealing with the topic of "invalid JSON", but see in particular the Q:

Is there a way to have jq keep going after it hits an error in the input file?

In particular, this shows how to use --seq.

However, from the sparse details you've given (SO recommends a minimal example be given), it would seem it might be better simply to use inputs. The idea is to process one JSON entity at a time, using "try/catch", e.g.

def handle: inputs | [., "length is \(length)"] ;
def process: try handle catch ("Failed", process) ;
process  

Don't forget to use the -n option when invoking jq.
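A full invocation might look like the following sketch (the sample input file and its contents are made up); `-n` stops jq from consuming the first entity before `inputs` runs:

```shell
# Hypothetical input: two valid JSON entities.
printf '{"a":1}\n[1,2,3]\n' > entities.json

# -n: don't read input automatically; `inputs` then streams every
# remaining entity through handle, and try/catch lets process report
# a failure and keep going.
jq -nc '
  def handle: inputs | [., "length is \(length)"];
  def process: try handle catch ("Failed", process);
  process
' entities.json
# → [{"a":1},"length is 1"]
#   [[1,2,3],"length is 3"]
```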

See also Processing not-quite-valid JSON.

peak
  • this seems to be using a lib, not the command-line `jq` – MarkHu Oct 05 '22 at 23:16
  • @MarkHu - Nope. You can add your own functions using command-line jq (or command-line gojq), though typically you'd read the jq program using the -f command-line option. – peak Oct 06 '22 at 02:33
-1

If JSON in curly braces {}:

grep -Pzo '\{(?>[^\{\}]|(?R))*\}' | jq 'objects'

If JSON in square brackets []:

grep -Pzo '\[(?>[^\[\]]|(?R))*\]' | jq 'arrays'

This works if there are no []{} in non-JSON lines.

James Risner