There's no simple way in the standard library to convert that data format to JSON, so we need to write a parser. However, since the data format is fairly simple that's not hard to do. We can use the standard csv
module to read the data. The csv.reader
will handle the details of parsing spaces and quoted strings correctly. A quoted string will be treated as a single token, tokens consisting of a single word may be quoted but they don't need to be.
The csv.reader
normally gets its data from an open file, but it's quite versatile, and will also read its data from a list of strings. This is convenient while testing since we can embed our input data into the script.
We parse the data into a nested dictionary. A simple way to keep track of the nesting is to use a stack, and we can use a plain list as our stack.
The code below assumes that input lines can be one of three forms:
- Plain data. The line consists of a key - value pair, separated by at least one space.
- A new subobject. The line starts with a key and ends in an open brace
{
.
- The end of the current subobject. The line contains a single close brace
}
import csv
import json
raw = '''\
name {
first_name: "random"
}
addresses {
location {
locality: "India"
street_address: "xyz"
postal_code: "300092"
full_address: "street 1 , abc,India"
}
}
projects {
url: "www.githib.com"
}
'''.splitlines()
# A stack to hold the parsed objects
stack = [{}]
reader = csv.reader(raw, delimiter=' ', skipinitialspace=True)
for row in reader:
#print(row)
key = row[0]
if key == '}':
# The end of the current object
stack.pop()
continue
val = row[-1]
if val == '{':
# A new subobject
stack[-1][key] = d = {}
stack.append(d)
else:
# A line of plain data
stack[-1][key] = val
# Convert to JSON
out = json.dumps(stack[0], indent=4)
print(out)
output
{
"name": {
"first_name:": "random"
},
"addresses": {
"location": {
"locality:": "India",
"street_address:": "xyz",
"postal_code:": "300092",
"full_address:": "street 1 , abc,India"
}
},
"projects": {
"url:": "www.githib.com"
}
}