
I'm creating a JSON file using jq. This is the output:

{
   "temperature":"21", 
   "humidity":"12.3", 
   "message":"Today ID 342 is running"
}
{
   "temperature":"13", 
   "humidity":"40.1", 
   "message":"Today ID 98 is running"
}

If I try to open this file in Python, I get errors unless I manually remove the newlines and tabs, like this:

{"temperature":"21","humidity":"12.3","message":"Today ID 342 is running"}
{"temperature":"13","humidity":"40.1","message":"Today ID 98 is running"}

I tried the -j option in jq, but nothing changed. Any suggestions? A solution that uses other programs (sed etc.) is fine too. Thanks!!

alcor
  • Can your message have special characters like `"Who\tId\nAlcor\t342\n"` ? Or can you use `tr -d` and add a newline after each `}` when finished? – Walter A Aug 26 '19 at 11:47

2 Answers

3

Tabs, newlines, and spaces within a JSON object or list are absolutely OK.

The file is not a valid JSON document because it contains several JSON documents (objects in this case) one after another, separated by newlines. Such a concatenation is not itself valid JSON, so a strict JSON parser cannot read it in a single call, at least not the one in Python's json module.
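To illustrate the point, here is a minimal sketch using shortened sample data modelled on the question. The helper name `try_parse` is made up for this example. It shows that whitespace inside a single object is accepted, while two objects in one string are rejected.

```python
import json

# Two pretty-printed objects back to back, shaped like the file in the question.
text = '''{
   "temperature":"21",
   "humidity":"12.3"
}
{
   "temperature":"13",
   "humidity":"40.1"
}
'''


def try_parse(s):
    """Return the parsed value, or the JSONDecodeError if parsing fails."""
    try:
        return json.loads(s)
    except json.JSONDecodeError as exc:
        return exc


# Whitespace (tabs, newlines) inside one object is fine.
single = try_parse('{\n\t"temperature": "21"\n}')

# Two objects in one string fail: the parser reads the first object
# and then complains about the data that follows it.
result = try_parse(text)
print(type(result).__name__, result)
```

The error message mentions "Extra data", which points at the second object rather than at the newlines or tabs.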

If you are willing to pre-process the file with jq, you can put those objects into a list with the -s option:

jq -s . input.json > output.json
cat output.json
[
  {
    "temperature": "21",
    "humidity": "12.3",
    "message": "Today ID 342 is running"
  },
  {
    "temperature": "13",
    "humidity": "40.1",
    "message": "Today ID 98 is running"
  }
]

Then use json.load in Python:

import json

with open('output.json') as file_desc:
    measurements = json.load(file_desc)

Pure Python solutions can be found here: How I can I lazily read multiple JSON values from a file/stream in Python?
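As one possible pure Python approach (a sketch, not necessarily what the linked question recommends), the standard library's `json.JSONDecoder.raw_decode` can parse one value at a time from a string and report where it stopped. The function name `iter_json_values` and the sample data are assumptions made for this example.

```python
import json


def iter_json_values(text):
    """Yield each JSON value found in a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while True:
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            return
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Shortened sample shaped like the pretty-printed jq output in the question.
sample = '''{
   "temperature":"21",
   "humidity":"12.3"
}
{
   "temperature":"13",
   "humidity":"40.1"
}
'''

values = list(iter_json_values(sample))
print(values)
```

For a real file, the text could come from `open('input.json').read()`. This reads the whole file into memory, so it is not lazy in the sense of the linked question.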

hek2mgl
  • Thank you for your answer. So why does Python recognize the file if I manually remove the newlines, even without the list format? Is there a way with jq to delete all newlines except the ones at the end of each JSON object? – alcor Aug 26 '19 at 12:57
  • Python will raise a JSONDecodeError if you attempt to `json.load()` a file with more than one dictionary in it. I've just double checked it. Not sure what you've exactly been doing, but just `jq -c` will not work – hek2mgl Aug 26 '19 at 15:24
  • I used `json.loads()` and it worked; I tried again now and it still works. I'm reading the file line by line and storing every single JSON object in a list. – alcor Aug 27 '19 at 11:04
  • Ok, loading each document separately, line by line should work. If you use `jq` anyways already, you can also use `jq -s` as I showed and then load the whole list of documents in one run. – hek2mgl Aug 27 '19 at 11:08
0

OK, I solved the problem by disabling jq's default pretty-printing. I added the -c option, and now every JSON object is on one line:

jq -c . file.json > file2.json
cat file2.json
{"temperature":"21","humidity":"12.3","message":"Today ID 342 is running"}
{"temperature":"13","humidity":"40.1","message":"Today ID 98 is running"}

I load them like this:

import json

measurements = []
with open('file2.json') as file_desc:
    for line in file_desc:
        # Treat each line as a separate document
        measurements.append(json.loads(line))
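A slightly more defensive variant of the same line-by-line idea is sketched below. It skips blank lines, which `json.loads` would otherwise reject, and reports the line number of any line that fails to parse. The function name `load_json_lines` is made up for this example, and the sample list stands in for the lines of file2.json.

```python
import json


def load_json_lines(lines):
    """Parse an iterable of lines, one JSON document per non-blank line."""
    records = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            # A blank line is not a JSON document; skip it instead of failing.
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # Re-raise with the line number to make bad input easier to find.
            raise ValueError(f"line {lineno}: {exc}") from exc
    return records


# Stand-in for the contents of file2.json, with a stray blank line added.
sample = [
    '{"temperature":"21","humidity":"12.3","message":"Today ID 342 is running"}\n',
    '\n',
    '{"temperature":"13","humidity":"40.1","message":"Today ID 98 is running"}\n',
]

measurements = load_json_lines(sample)
print(len(measurements))
```

Since an open file object is itself an iterable of lines, it can be passed directly: `load_json_lines(file_desc)`.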
hek2mgl
alcor
  • I don't know why the answer is not useful, maybe the solution is not replicable. I'm looking for other working solutions. – alcor Aug 27 '19 at 11:05
  • I didn't down-vote the answer, but I think it was just unclear how you solved it. I've added that from the information in the comments below my answer. – hek2mgl Aug 27 '19 at 11:14
  • Generally the solution should work. I could imagine that my solution, with `jq -s`, is faster on large files, but I haven't tried that. – hek2mgl Aug 27 '19 at 11:18