107

I have a Pandas DataFrame with two columns – one with the filename and one with the hour in which it was generated:

 File       Hour
  F1         1
  F1         2
  F2         1
  F3         1

I am trying to convert it to a JSON file with the following format:

{"File":"F1","Hour":"1"} 
{"File":"F1","Hour":"2"}
{"File":"F2","Hour":"1"}
{"File":"F3","Hour":"1"}

When I use the command DataFrame.to_json(orient = "records"), I get the records in the below format:

[{"File":"F1","Hour":"1"},
 {"File":"F1","Hour":"2"},
 {"File":"F2","Hour":"1"},
 {"File":"F3","Hour":"1"}]

I'm just wondering whether there is an option to get the JSON file in the desired format. Any help would be appreciated.

MarredCheese
  • 17,541
  • 8
  • 92
  • 91
user3447653
  • 3,968
  • 12
  • 58
  • 100
  • 25
    What you want isn't proper `json`. What you get from `dataframe.to_json(orient = “records”)` is proper `json` – piRSquared Aug 31 '16 at 19:22

9 Answers9

92

The output that you get after DF.to_json is a string. So, you can simply slice it according to your requirement and remove the commas from it too.

out = df.to_json(orient='records')[1:-1].replace('},{', '} {')

To write the output to a text file, you could do:

with open('file_name.txt', 'w') as f:
    f.write(out)
Nickil Maveli
  • 29,155
  • 8
  • 82
  • 85
  • This removes the commas between two key values pairs : {"ServerGroup":"Map""Hour":0}. I need it to be {"ServerGroup":"Map","Hour":0}. – user3447653 Aug 31 '16 at 19:14
  • Yeah, that was my mistake. Please try the modified one. – Nickil Maveli Aug 31 '16 at 19:21
  • Thanks, that works fine. Initially I used to convert the dataframe to a json file and store it in a local file using "df1.to_json(orient='records',path_or_buf='/content/tmp/GoogleCount.json')[1:-1].replace('},{','} {')". But now after the replace statement, I get the error "'NoneType' object has no attribute '__getitem__'" – user3447653 Aug 31 '16 at 19:27
  • That's because it's no longer a `json` formatted object but rather a `json-string`. Try passing `path_or_buf=None` as the argument instead. – Nickil Maveli Aug 31 '16 at 19:35
  • Ok Thanks. But I need to save this in a file, just wondering is there a way to achieve it – user3447653 Aug 31 '16 at 19:38
  • AttributeError: 'dict' object has no attribute 'to_json' – Mona Jalal Sep 25 '17 at 23:01
  • Instead of writing the output , how can I view one record? The file is too huge to write out just for viewing content. – Murtaza Haji Jun 09 '20 at 03:50
88

In newer versions of pandas (0.20.0+, I believe), this can be done directly:

df.to_json('temp.json', orient='records', lines=True)

Direct compression is also possible:

df.to_json('temp.json.gz', orient='records', lines=True, compression='gzip')
Brad Solomon
  • 38,521
  • 31
  • 149
  • 235
  • 1
    best solution imho! – Aman Mar 18 '21 at 19:06
  • 2
    JEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEZ I had been stuck on this for so long and it was so dumb and you've saved me thanks so much (I couldn't use temporary CSVs for some reason), anyway, THANKS!! – Patrick Ruff May 13 '21 at 03:30
28

I think what the OP is looking for is:

with open('temp.json', 'w') as f:
    f.write(df.to_json(orient='records', lines=True))

This should do the trick.

AP.
  • 8,082
  • 2
  • 24
  • 33
sagarsar
  • 289
  • 4
  • 2
12

Try this one:

json.dumps(json.loads(df.to_json(orient="records")))
chandra sutrisno
  • 491
  • 4
  • 17
9

use this formula to convert a pandas DataFrame to a list of dictionaries :

import json
json_list = json.loads(json.dumps(list(DataFrame.T.to_dict().values())))
Amir.S
  • 719
  • 8
  • 15
4

convert data-frame to list of dictionary

list_dict = []

for index, row in list(df.iterrows()):
    list_dict.append(dict(row))

save file

with open("output.json", mode) as f:
    f.write("\n".join(str(item) for item in list_dict))
Hafiz Shehbaz Ali
  • 2,566
  • 25
  • 21
1

To transform a dataFrame in a real json (not a string) I use:

    from io import StringIO
    import json
    import DataFrame

    buff=StringIO()
    #df is your DataFrame
    df.to_json(path_or_buf=buff,orient='records')
    dfJson=json.loads(buff)
0

instead of using dataframe.to_json(orient = “records”) use dataframe.to_json(orient = “index”) my above code convert the dataframe into json format of dict like {index -> {column -> value}}

-2

Here is small utility class that converts JSON to DataFrame and back: Hope you find this helpful.

# -*- coding: utf-8 -*-
from pandas.io.json import json_normalize

class DFConverter:

    #Converts the input JSON to a DataFrame
    def convertToDF(self,dfJSON):
        return(json_normalize(dfJSON))

    #Converts the input DataFrame to JSON 
    def convertToJSON(self, df):
        resultJSON = df.to_json(orient='records')
        return(resultJSON)
Siva
  • 170
  • 1
  • 3
  • 11
  • 1
    Works, but actually just a wrapper for the same available oneliner functions. I think most of the cases this is unwarranted in this form. – MattSom May 13 '20 at 11:16