I have been trying to parse a JSON file, but it keeps raising an "Extra data" error. I am new to Python and do not know how to resolve this. The file seems to contain multiple JSON objects. How can I parse it without errors?

Edit: (Not my code but I am trying to work on it)

import json
import csv
import io

'''
creates a .csv file using a Twitter .json file
the fields have to be set manually
'''

data_json = io.open('filename', mode='r', encoding='utf-8').read() #reads in the JSON file
data_python = json.loads(data_json)

csv_out = io.open('filename', mode='w', encoding='utf-8') #opens csv file


fields = u'created_at,text,screen_name,followers,friends,rt,fav' #field names
csv_out.write(fields)
csv_out.write(u'\n')

for line in data_python:

    #writes a row and gets the fields from the json object
    #screen_name and followers/friends are found on the second level hence two get methods
    row = [line.get('created_at'),
           '"' + line.get('text').replace('"','""') + '"', #creates double quotes
           line.get('user').get('screen_name'),
           unicode(line.get('user').get('followers_count')),
           unicode(line.get('user').get('friends_count')),
           unicode(line.get('retweet_count')),
           unicode(line.get('favorite_count'))]

    row_joined = u','.join(row)
    csv_out.write(row_joined)
    csv_out.write(u'\n')




csv_out.close()

Edit 2: I found another recipe to parse the file, but I do not know how to save its output. Any recommendations?

import json
import re

json_as_string = open('filename.json', 'r')
# Call this as a recursive function if your json is highly nested
lines = [re.sub("[\[\{\]]*", "", one_object.rstrip()) for one_object in json_as_string.readlines()]

json_as_list = "".join(lines).split('}')
for elem in json_as_list:
    if len(elem) > 0:
        print(json.loads(json.dumps("{" + elem[::1] + "}")))
Ess Tee
  • Updated with code. Unable to parse the file properly. I need to be able to get good output so I can analyse it. – Ess Tee Jul 13 '17 at 17:46
  • What version of python do you use? And could you show the traceback - without that it's hard to help you... – jlaur Jul 13 '17 at 19:24
  • Using python 2.7. Is it OK if I paste this tomorrow? Been at it since morning. Around one am here. – Ess Tee Jul 13 '17 at 19:59
  • Traceback (most recent call last):
      File "C:/Users/jest/PycharmProjects/untitled/token.py", line 14, in <module>
        data_python = json.loads(data_json)
      File "C:\Python27\lib\json\__init__.py", line 310, in loads
        return _default_decoder.decode(s)
      File "C:\Python27\lib\json\decoder.py", line 349, in decode
        raise ValueError(errmsg("Extra data", s, end, len(s)))
    ValueError: Extra data: line 2 column 1 - line 1201 column 1 (char 13339 - 8801096)
    Process finished with exit code 1 – Ess Tee Jul 14 '17 at 05:42
  • So for loading the json read this: https://stackoverflow.com/a/20199213/8240959. For extracting data from json look at answers like this one: https://stackoverflow.com/a/28218931/8240959. – jlaur Jul 14 '17 at 07:30
  • There are tons of tutorials online on how to work with JSON in Python. Google around and take some time getting the grasp of it. Look at this for instance: http://www.w3resource.com/JSON/python-json-module-tutorial.php. If you need help extracting a specific element from your tweet file, please show the JSON object that you need to extract from. Without that it's impossible to help you further. – jlaur Jul 14 '17 at 07:43
  • Would a link to an example JSON file help? – Ess Tee Jul 14 '17 at 11:54
  • Sure - if the structure in the example is identical to the one in your file... – jlaur Jul 16 '17 at 22:18
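
For reference, the "Extra data: line 2 column 1" message in the traceback suggests that the file holds one complete JSON object followed by more objects, rather than a single JSON array. A minimal sketch of one way to handle that, assuming the objects are simply concatenated or separated by whitespace (the real file was never shown, so this is a guess), is to decode them one at a time with `json.JSONDecoder.raw_decode` and let the `csv` module handle quoting. The helper names `iter_json_objects`, `tweet_to_row` and `json_to_csv` are made up for illustration, and the sketch is written for Python 3; under Python 2.7 the `csv` part would need adjusting because that version's `csv` module does not write to unicode text streams.

```python
import csv
import io
import json


def iter_json_objects(text):
    """Yield each top-level JSON value found in `text`, in order.

    Assumes the values are concatenated or separated only by whitespace
    (for example, one tweet object per line).
    """
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode() does not skip leading whitespace, so skip it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Returns the decoded value and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def tweet_to_row(tweet):
    """Pick the fields used in the question; missing keys become None."""
    user = tweet.get('user') or {}
    return [
        tweet.get('created_at'),
        tweet.get('text'),
        user.get('screen_name'),
        user.get('followers_count'),
        user.get('friends_count'),
        tweet.get('retweet_count'),
        tweet.get('favorite_count'),
    ]


def json_to_csv(json_path, csv_path):
    """Convert a file of concatenated tweet objects into a CSV file."""
    with io.open(json_path, mode='r', encoding='utf-8') as src:
        text = src.read()
    # newline='' lets the csv module control line endings (Python 3).
    with io.open(csv_path, mode='w', encoding='utf-8', newline='') as dst:
        writer = csv.writer(dst)
        writer.writerow(['created_at', 'text', 'screen_name', 'followers',
                         'friends', 'rt', 'fav'])
        for tweet in iter_json_objects(text):
            # csv.writer quotes commas and double quotes in the text itself.
            writer.writerow(tweet_to_row(tweet))
```

Calling `json_to_csv('tweets.json', 'tweets.csv')` (both file names are placeholders) would then write one CSV row per object. Using `csv.writer` instead of joining strings by hand also covers the "save the output" part of Edit 2, since tweet text containing commas or quotes is escaped automatically.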

0 Answers