
So the JSON file I want to parse looks like this:

{
  "container_header_255_2013-12-31 16:00:45": {
    "fw_package_version": "255.255.255X255",
    "start_timestamp": 1388534445,
    "start_timestr": "2013-12-31 16:00:45",
    "end_timestamp": 4294967295,
    "end_timestr": "2106-02-06 22:28:15",
    "length": 65535,
    "product": "UNKNOWN",
    "hw_version": "UNKNOWN"
  },
  "log_packet_debug_1388534445_2013-12-31 16:00:45": {
    "timestamp": 1388534445,
    "timestr": "2013-12-31 16:00:45",
    "log_level": "DBG",
    "log_id": "0xC051",
    "log_string": "DBG_STORAGE_LOG",
    "file_name_line": "storage_data.c733",
    "message": "Mark as Erasable: 231 238"
  },

Sorry, the indentation might be a little off. Anyway, all the examples I have seen online include lists, and for some reason this one only contains dictionaries.

  • import json; v = json.loads(your_json_text). However, I am guessing here, and this "message": "Mark as Erasable: 231 238" should be a list. It's not valid JSON, and you will have to either write your own parser or format it as valid JSON: { ... "message": {"Mark as Erasable": [231, 238]} } – Davoud Taghawi-Nejad Jul 22 '15 at 18:39
  • I thought that was the problem; I did not create the JSON file, however, so I'll just have to parse it manually. – Brian Crafton Jul 22 '15 at 18:41
  • Should be reopened as "how to parse an invalid JSON file" – Davoud Taghawi-Nejad Jul 22 '15 at 18:42
  • I don't understand, what's invalid about your json text? Other than it not ending (which I'm guessing is an artifact of you snipping part of the text for the question, not an actual feature of your data), the text you've shown is valid. – Blckknght Jul 22 '15 at 21:56
  • @Blckknght Actually, I think this is the essence of the OP's problem. This is a quite common problem with log files, for example. – Krumelur Jul 22 '15 at 22:34
  • @Krumelur: Ah, I see. That wasn't clear from the question. – Blckknght Jul 23 '15 at 05:00
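Since the comments conclude that the real problem is a log file whose top-level object is never closed, one cheap stdlib workaround (before reaching for a dedicated module) is to repair the text and parse it whole: strip any trailing comma and append the missing brace. A minimal sketch, with made-up sample data:

```python
import json
import re

# Hypothetical truncated log: the top-level object is never closed
text = '{"header": {"length": 65535}, "packet": {"timestamp": 1388534445},'

# Drop a trailing comma (if any), then append the missing closing brace
fixed = re.sub(r',\s*$', '', text) + '}'
data = json.loads(fixed)
print(data["packet"]["timestamp"])  # prints 1388534445
```

This only works if the file was cut off cleanly between entries; a file truncated mid-value still needs real stream parsing.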

1 Answer


You can use the splitstream module (disclaimer: I wrote it) for this (pip install splitstream). It has a parameter, startdepth, which is specifically designed to parse XML/JSON streams that have not yet been terminated or are "infinite" (such as log files).

from splitstream import splitfile
from StringIO import StringIO  # Python 2; on Python 3 use io.StringIO
import json

jsonfile = StringIO(""".....""")  # your never-ending JSON-ish logfile
# More likely, you want something like this instead:
#jsonfile = open("/var/log/my/log.json", "r")

# startdepth is the magic argument here: it starts splitting at depth = 1
for s in splitfile(jsonfile, format="json", startdepth=1):
    print "JSON", json.loads(s)

Which gives:

JSON {u'start_timestamp': 1388534445, u'hw_version': u'UNKNOWN', u'fw_package_version': u'255.255.255X255', u'product': u'UNKNOWN', u'end_timestr': u'2106-02-06 22:28:15', u'length': 65535, u'start_timestr': u'2013-12-31 16:00:45', u'end_timestamp': 4294967295}
JSON {u'file_name_line': u'storage_data.c733', u'log_level': u'DBG', u'log_id': u'0xC051', u'timestamp': 1388534445, u'timestr': u'2013-12-31 16:00:45', u'log_string': u'DBG_STORAGE_LOG', u'message': u'Mark as Erasable: 231 238'}
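If installing a module isn't an option, the same depth-1 splitting idea can be hand-rolled with the stdlib. A sketch (Python 3, with hypothetical sample data) that tracks brace depth while skipping string literals, yielding each inner object as soon as it closes:

```python
import json

def split_depth1(text):
    """Yield each depth-1 object from a (possibly unterminated) top-level JSON object."""
    depth = 0
    start = None
    in_str = esc = False
    for i, ch in enumerate(text):
        if esc:                 # previous char was a backslash: skip this one
            esc = False
        elif ch == '\\':
            esc = True
        elif ch == '"':
            in_str = not in_str
        elif in_str:
            pass                # braces inside string literals don't count
        elif ch == '{':
            depth += 1
            if depth == 2:
                start = i       # an inner object begins
        elif ch == '}':
            depth -= 1
            if depth == 1:
                yield json.loads(text[start:i + 1])

# hypothetical truncated log (never closes the outer object)
log = '{"header": {"length": 65535}, "packet": {"msg": "Mark as Erasable: 231 238"},'
for obj in split_depth1(log):
    print(obj)
```

This reads the whole text into memory, unlike splitstream, which works incrementally on a file object; it is a fallback, not a replacement.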
Krumelur