
I have a large JSON file (38 GB) with this structure:

{"Text": [

    {
        "a" : 6,
        "b" : 2022,
        "c" : 11,
        "d" : "2022-11-24",
        "e" : "567",
        "f" : "ww",
        "i" : "00",
        "j" : 4,
        "k" : "y",
        "l" : null,
        "m" : 3,
        "n" : 7,
        "o" : "54",
        "p" : null,
        "q" : "yes",
        "r" : 77,
        "s" : "yes",
        "t" : 6,
        "u" : 8,
        "v" : "yy",
        "w" : "yy",
        "x" : "o",
        "y" : "100                                                  ",
        "z" : "r"
    },{...},{...}

    ]}

So I wrote this code to convert the JSON to a DataFrame, but I get a memory error.

import itertools
import json

import pandas as pd

with open('data.json', encoding='utf-8-sig') as f:
    c = json.loads(f.read())  # loads the entire 38 GB file into memory
    v = list(c.keys())[0]
    data = c[v]

# DataFrame.append is deprecated; collect the frames and concat once
frames = [pd.json_normalize(line) for line in itertools.islice(data, 2)]
df = pd.concat(frames, ignore_index=True)

Any advice on how to convert it while avoiding the memory error?
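One way to avoid calling `json.loads` on the whole file is to stream the records one at a time with the standard library's `json.JSONDecoder.raw_decode`. A minimal sketch, assuming the `{"Text": [ ... ]}` wrapper shown above; `iter_records` and `buf_size` are illustrative names, not an existing API:

```python
import json

def iter_records(path, buf_size=1 << 20):
    """Yield one record dict at a time from a file shaped like
    {"Text": [ {...}, {...}, ... ]} without loading it all into memory."""
    decoder = json.JSONDecoder()
    with open(path, encoding='utf-8-sig') as f:
        buf = ''
        # Read until the opening '[' of the record array appears.
        while '[' not in buf:
            chunk = f.read(buf_size)
            if not chunk:
                return
            buf += chunk
        buf = buf[buf.index('[') + 1:]
        while True:
            # Skip the comma and whitespace between records.
            buf = buf.lstrip().lstrip(',').lstrip()
            if buf.startswith(']'):  # end of the array
                return
            try:
                obj, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                # Record is split across the buffer boundary: read more.
                chunk = f.read(buf_size)
                if not chunk:
                    return
                buf += chunk
                continue
            yield obj
            buf = buf[end:]
```

Each yielded dict can then be collected into batches of, say, a few thousand records and turned into a DataFrame with `pd.json_normalize` plus `pd.concat`, so only one batch sits in memory at a time. A streaming-parser library such as `ijson` does the same job more robustly.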

Fatima
