Python parsing JSON part of line in file

Question

I would like to ask for help with my JSON parsing. I have a file where every line looks like this one:

some hexadecimal numbers|something else|int|UA info|{'computer': {'os': {'version': 'blabla', 'name': 'blabla'}, 'app': {'version': 'blabla', 'name': 'blabla'}}}

I have code that splits every line into parts:

for line in some_file:
    line2 = line.split('|')

and I want to take the last part of each line (which should be in JSON format, at least I think so) and parse it for future use (I mean that I want to write os=name version, app=name version to another file). I tried something like this:

json_string = json.loads(line2[4])

but Python gives me errors:

Expecting property name: line 1 column 2 (char 1)

or

No JSON object could be decoded

I know that it's something stupid, but I don't know what to do. I would appreciate any advice.

score 0 · Answer 1 · edited May 23 '17 at 11:52

JSON requires double quotes around strings, so this single-quoted text cannot be loaded as is with the json module.
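As a quick illustration of the quoting rule, here is a minimal sketch showing that the standard json module rejects single-quoted text but accepts the same data written with double quotes. The two sample strings and the try_json helper are made up for this example.

```python
import json

# Hypothetical sample strings: same data, different quoting.
single_quoted = "{'os': {'name': 'blabla', 'version': 'blabla'}}"
double_quoted = '{"os": {"name": "blabla", "version": "blabla"}}'


def try_json(text):
    """Return the parsed object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


print(try_json(single_quoted))  # rejected by the JSON parser
print(try_json(double_quoted))  # parsed into a nested dict
```

The first call fails for the same reason as the json.loads call in the question; only the second string is valid JSON.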

I would use csv to parse the pipe-delimited file and ast.literal_eval() to safely load the last column values into Python dictionaries:

import csv
from ast import literal_eval


with open("file.csv") as f:
    reader = csv.reader(f, delimiter="|")
    data = [literal_eval(line[-1]) for line in reader]

print(data)  # data contains a list of dictionaries now
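To get from the parsed dictionaries to the output the question asks for (os=name version, app=name version), something along these lines should work. This is only a sketch: the sample line and its values, the in-memory io.StringIO objects standing in for the real input and output files, and the format_record helper are all assumptions made for illustration.

```python
import csv
import io
from ast import literal_eval

# Hypothetical sample input; in practice this would be open("file.csv").
sample = io.StringIO(
    "0x1337|something else|123|UA info|"
    "{'computer': {'os': {'version': '10', 'name': 'Windows'}, "
    "'app': {'version': '2.1', 'name': 'MyApp'}}}\n"
)


def format_record(record):
    """Build 'os=name version, app=name version' from one parsed dict."""
    computer = record["computer"]
    os_info, app_info = computer["os"], computer["app"]
    return "os={} {}, app={} {}".format(
        os_info["name"], os_info["version"],
        app_info["name"], app_info["version"],
    )


out = io.StringIO()  # stands in for open("output.txt", "w")
for row in csv.reader(sample, delimiter="|"):
    # The last column holds the Python-literal dictionary.
    out.write(format_record(literal_eval(row[-1])) + "\n")

print(out.getvalue(), end="")
```

Plain indexing (record["computer"]["os"]) raises KeyError on a line that lacks a key, which is usually preferable to silently writing incomplete output.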

score 0 · Accepted Answer · answered Dec 12 '15 at 15:44

It's not JSON; it looks like a Python literal.

You can use ast.literal_eval to convert it to a Python object:

import ast
# line2 is the list produced by line.split('|') in the question
ast.literal_eval(line2[4])

BTW, there's one missing closing brace in the question, so:

ast.literal_eval(line2[4] + '}')
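Because a truncated or otherwise malformed literal makes ast.literal_eval raise an exception, it may be worth guarding the call when processing a whole file. A minimal sketch, assuming you would rather skip bad lines than abort (parse_last_field is a hypothetical helper name, and the two sample lines are invented):

```python
import ast


def parse_last_field(line):
    """Return the dict from the last pipe-separated field, or None if it is malformed."""
    field = line.rstrip("\n").split("|")[-1]
    try:
        return ast.literal_eval(field)
    except (ValueError, SyntaxError):
        # SyntaxError: unbalanced braces and similar; ValueError: not a plain literal.
        return None


good = "a|b|1|UA|{'computer': {'os': {'name': 'x', 'version': '1'}}}"
bad = "a|b|1|UA|{'computer': {'os': {'name': 'x', 'version': '1'}}"  # missing closing brace

print(parse_last_field(good))
print(parse_last_field(bad))
```

Returning None keeps the loop going; whether to log, count, or re-raise for such lines depends on how much you trust the input.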

answered Dec 12 '15 at 15:44

falsetru


And after that? How can I access the individual parts? – Geofre Dec 12 '15 at 16:05

Cyrbil · Answer 3 · 2015-12-12T16:27:56.480

JSON requires double quotes for any string literal.

A string is a sequence of zero or more Unicode characters, wrapped in double quotes, using backslash escapes. - http://www.json.org/

One easy way to get it parsed is to use a YAML parser.
YAML can parse JSON and is less strict about the syntax. Use yaml.safe_load rather than yaml.load: recent PyYAML versions require an explicit Loader argument for yaml.load, and safe_load only builds plain Python objects.

>>> import yaml # from package pyyaml
>>> yaml.safe_load("{'test': 'ok'}")
{'test': 'ok'}
>>> data = yaml.safe_load("{'computer': {'os': {'version': 'blabla', 'name': 'blabla'}, 'app': {'version': 'blabla', 'name': 'blabla'}}}")
>>> data.get('computer').get('app').get('version')
'blabla'

As for the pipe-delimited data, you can split it as you already do, or use the csv module. As a bonus, you can pass every field to yaml.safe_load and it will handle the type conversion:

import csv
import io

import yaml  # from package pyyaml

some_file = io.StringIO("0x1337|something else|12456789|UA info|{'computer': {'os': {'version': 'blabla', 'name': 'blabla'}, 'app': {'version': 'blabla', 'name': 'blabla'}}}")
elements = csv.reader(some_file, delimiter="|")
# csv.reader is an iterator and cannot be indexed, so take the first row with next()
for element in next(elements):
    print(yaml.safe_load(element))

Output:

4919
something else
12456789
UA info
{'computer': {'os': {'version': 'blabla', 'name': 'blabla'}, 'app': {'version': 'blabla', 'name': 'blabla'}}}
