JSON file with duplicate keys to pandas DataFrame

Question

I'm trying to load a JSON file into a pandas DataFrame. The file contains duplicate keys.

Following the answer to the question "Python json parser allow duplicate keys", I tried:

from collections import OrderedDict
from json import JSONDecoder


def make_unique(key, dct):
    counter = 0
    unique_key = key

    while unique_key in dct:
        counter += 1
        unique_key = '{}_{}'.format(key, counter)
    return unique_key


def parse_object_pairs(pairs):
    dct = OrderedDict()
    for key, value in pairs:
        if key in dct:
            key = make_unique(key, dct)
        dct[key] = value

    return dct



decoder = JSONDecoder(object_pairs_hook=parse_object_pairs)

with open("file.json") as f:
    obj = decoder.decode(f)
    #print obj

I received the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-70-0d2633348c10> in <module>()
  2 
  3 with open("file.json") as f:
 ----> 4     obj = decoder.decode(f)
  5     #print obj
  6 

C:\ProgramData\Anaconda2\lib\json\decoder.pyc in decode(self, s, _w)
362 
363         """
--> 364         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
365         end = _w(s, end).end()
366         if end != len(s):

TypeError: expected string or buffer

What am I missing?

`obj = decoder.decode(f.read())`? – BallpointBen May 09 '18 at 18:33

score 0 · Accepted Answer · answered May 09 '18 at 18:35

The problem is that JSONDecoder.decode takes a string, not a file object, so you need to pass in the full text of the file. That can be as simple as

with open('file.json') as f:
    obj = decoder.decode(f.read())

if the size of file.json is reasonable. If it's too large to load in one go, you will need to look into parsing the JSON file incrementally.
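
To see the fix end to end, here is a minimal sketch in Python 3 syntax. It assumes a made-up JSON text held in an `io.StringIO` as a stand-in for `file.json`, and it folds the question's `make_unique` helper into `parse_object_pairs` for brevity; the renaming logic is otherwise the same.

```python
import io
from collections import OrderedDict
from json import JSONDecoder


def parse_object_pairs(pairs):
    """Keep every pair, renaming repeated keys to key_1, key_2, ..."""
    dct = OrderedDict()
    for key, value in pairs:
        unique_key, counter = key, 0
        while unique_key in dct:
            counter += 1
            unique_key = '{}_{}'.format(key, counter)
        dct[unique_key] = value
    return dct


decoder = JSONDecoder(object_pairs_hook=parse_object_pairs)

# Stand-in for open("file.json"); the contents are invented for illustration.
f = io.StringIO(u'{"a": 1, "a": 2, "b": 3, "a": 4}')

obj = decoder.decode(f.read())  # pass the text, not the file object
print(list(obj.items()))
# [('a', 1), ('a_1', 2), ('b', 3), ('a_2', 4)]
```

If the decoded object is flat like this one, something like `pandas.DataFrame([obj])` should then give a one-row frame with columns `a`, `a_1`, `b`, `a_2`; nested data would likely need more reshaping first.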


Nick Chapman


It works, thanks. But now I get the following error: "ValueError: Expecting , delimiter: line 320 column 435 (char 127674)". Is it possible to ignore this error and still convert the JSON file into a DataFrame? – Thabra May 09 '18 at 18:38
@Thabra That means there is a problem in your JSON file. You should go to the location the error reports and look at what's there. – Nick Chapman May 09 '18 at 18:43
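
The decoder cannot skip a syntax error, but the reported position can be used to find it. A rough sketch, assuming Python 3.5+ where `json.JSONDecodeError` exposes `lineno`, `colno` and `pos` (on Python 2, as in the question, only the error message carries these numbers); `show_json_error` and the sample text are made up for illustration:

```python
import json


def show_json_error(text, width=20):
    """Return (lineno, colno, nearby text) for the first syntax error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # Slice a small window around the reported character offset.
        start = max(err.pos - width, 0)
        return err.lineno, err.colno, text[start:err.pos + width]
    return None


# Hypothetical broken input: the comma between "b" and "c" is missing.
bad = '{"a": 1,\n "b": 2 "c": 3}'
print(show_json_error(bad))
print(show_json_error('{"a": 1}'))  # valid JSON, so None
```

For the file in the question, the same idea would be applied to the text returned by `f.read()`, which should point at whatever sits near line 320, column 435.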
