
I have hundreds of similar JSON files and I want to save their contents into a single CSV file. The code below is my attempt, but it does not produce the output I want. The desired output is this CSV file: https://drive.google.com/file/d/1cgwdbnvETLf6nO1tNnH0F_-fLxUOdT7L/view?usp=sharing

What can I do to get the above output? Thanks. JSON file format: https://drive.google.com/file/d/1-OZYrfUtDJmwcRUjpBgn59zJt5MjtmWt/view?usp=sharing

list_=['politifact13565', 'politifact13601'] 
for i in list_:
    with open("{}/news content.json".format(i)) as json_input:
        json_data = json.load(json_input, strict=False)
        mydict = {}
        mydict["url"] = json_data["url"]
        mydict["text"] = json_data["text"]
        mydict["images"]=json_data["images"]
        mydict["title"]=json_data["title"]
        df = pd.DataFrame.from_dict(mydict, orient='index')
        df = df.T
        df.append(df, ignore_index=True)
        df.to_csv('out.csv')
        print(df)

SOLVED:

import json
import pandas as pd

list_ = ['politifact13565', 'politifact13601']
for i in list_:
    with open("{}/news content.json".format(i)) as json_input:
        json_data = json.load(json_input, strict=False)
        # Keep only the fields needed in the CSV
        mydict = {}
        mydict["url"] = json_data["url"]
        mydict["text"] = json_data["text"]
        mydict["images"] = json_data["images"]
        mydict["title"] = json_data["title"]
        # Build a one-row DataFrame: the keys become columns after the transpose
        df = pd.DataFrame.from_dict(mydict, orient='index')
        df = df.T
        # mode='a' appends to out.csv instead of overwriting it on every iteration
        df.to_csv('out.csv', mode='a', header=False)
        print(df)
  • Please post an example of one of those JSON files so we can copy it and try it ourselves. Screenshots are not very helpful. – Andre S. Oct 29 '20 at 09:23

2 Answers


Your solution is quite close to the desired output; you just need to transpose the imported JSON:

import glob
import pandas as pd
directory = "your/path/to/jsons/*.json"
df = pd.concat([pd.read_json(f, orient="index").T for f in glob.glob(directory)], ignore_index=True)

Afterwards, you can save the DataFrame with df.to_csv("tweets.csv").
Hopefully that helps you!
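
For context, the loop in the question calls to_csv with its default mode='w', so every iteration overwrites out.csv and only the last file survives. One way around that is to collect one row per file and write the CSV a single time. The following is a minimal sketch under assumptions: the function names collect_rows and folders_to_csv and the FIELDS list are hypothetical, and the four keys are taken from the question's code rather than from the actual JSON files.

```python
import json
from pathlib import Path

import pandas as pd

# Assumption: only these four keys are wanted, as in the question's code.
FIELDS = ["url", "text", "images", "title"]


def collect_rows(folders, filename="news content.json"):
    """Read one JSON file per folder and return a list of row dicts."""
    rows = []
    for folder in folders:
        path = Path(folder) / filename
        with open(path, encoding="utf-8") as json_input:
            # strict=False tolerates control characters inside JSON strings
            json_data = json.load(json_input, strict=False)
        # Missing keys become None instead of raising KeyError
        rows.append({key: json_data.get(key) for key in FIELDS})
    return rows


def folders_to_csv(folders, out_path="out.csv"):
    """Build a single DataFrame from all rows and write it once."""
    df = pd.DataFrame(collect_rows(folders), columns=FIELDS)
    df.to_csv(out_path, index=False)  # one header row, no index column
    return df
```

With the folder names from the question, this could be called as folders_to_csv(['politifact13565', 'politifact13601']). Writing once keeps the header row and avoids duplicated rows when the script is run again.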

Andre S.
import json
import pandas as pd

list_ = ['politifact13565', 'politifact13601']
for i in list_:
    with open("{}/news content.json".format(i)) as json_input:
        json_data = json.load(json_input, strict=False)
        # Keep only the fields needed in the CSV
        mydict = {}
        mydict["url"] = json_data["url"]
        mydict["text"] = json_data["text"]
        mydict["images"] = json_data["images"]
        mydict["title"] = json_data["title"]
        # Build a one-row DataFrame: the keys become columns after the transpose
        df = pd.DataFrame.from_dict(mydict, orient='index')
        df = df.T
        # mode='a' appends to out.csv instead of overwriting it on every iteration
        df.to_csv('out.csv', mode='a', header=False)
        print(df)
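
Note that appending with mode='a' and header=False leaves the CSV without a header row, and running the script a second time appends the same rows again. As an alternative, here is a rough sketch that opens the output once, writes the header once, and streams one row per JSON file using only the standard library. The name write_csv_streaming and the FIELDS list are hypothetical, and storing the "images" list as a JSON string in one cell is an assumption about the desired layout.

```python
import csv
import json
from pathlib import Path

# Assumption: the same four keys used in the code above.
FIELDS = ["url", "text", "images", "title"]


def write_csv_streaming(folders, out_path="out.csv", filename="news content.json"):
    """Open the CSV once, write the header once, then one row per JSON file."""
    written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as csv_output:
        writer = csv.DictWriter(csv_output, fieldnames=FIELDS)
        writer.writeheader()
        for folder in folders:
            path = Path(folder) / filename
            if not path.is_file():
                continue  # skip folders that lack the expected file
            with open(path, encoding="utf-8") as json_input:
                json_data = json.load(json_input, strict=False)
            row = {key: json_data.get(key, "") for key in FIELDS}
            # Assumption: keep a list of image URLs as a JSON string in one cell
            if isinstance(row["images"], list):
                row["images"] = json.dumps(row["images"])
            writer.writerow(row)
            written += 1
    return written  # number of rows written, excluding the header
```

Because the file is opened with mode "w" exactly once, rerunning the script replaces the output rather than growing it.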