I am trying to convert this large JSON file to CSV. Here is a sample of the JSON:

    [
        [
            {
                "User": "Phoebe",
                "text": "Oh my God, hes lost it. Hes totally lost it.",
                "sent": "non-neutral"
            },
            {
                "user": "Monica",
                "text": "What?",
                "sent": "surprise"
            }
        ],
        [
            {
                "user": "Joey",
                "text": "Hey Estelle, listen",
                "sent": "neutral"
            },
            {
                "user": "Estelle",
                "text": "Well! Well! Well! Joey Tribbiani! So you came back huh? They",
                "sent": "surprise"
            }
        ]
    ]


import json

with open('/Users/dsg281/Downloads/EmotionLines/Friends/friends_dev.json') as data_file:
    data = json.load(data_file)

I am trying to get the output as a CSV with the columns "User", "text", "sent".

Kum_R

2 Answers

I think you need the following (after changing the JSON to be valid, with the same "user" key in every record):

file.json

[
  [
    {
      "user": "Phoebe",
      "text": "Oh my God, hes lost it. Hes totally lost it.",
      "sent": "non-neutral"
    },
    {
      "user": "Monica",
      "text": "What?",
      "sent": "surprise"
    }

   ],
   [{
      "user": "Joey",
      "text": "Hey Estelle, listen",
      "sent": "neutral"
    },
    {
      "user": "Estelle",
      "text": "Well! Well! Well! Joey Tribbiani! So you came back huh? They",
      "sent": "surprise"
    }
  ]     
]

import json
import pandas as pd

with open('file.json') as data_file:
    data = json.load(data_file)

# build one DataFrame per inner list, then stack them
df = pd.concat([pd.DataFrame(x) for x in data], ignore_index=False)
print(df)
          sent                                               text     user
0  non-neutral       Oh my God, hes lost it. Hes totally lost it.   Phoebe
1     surprise                                              What?   Monica
0      neutral                                Hey Estelle, listen     Joey
1     surprise  Well! Well! Well! Joey Tribbiani! So you came ...  Estelle

And then:

df.to_csv('file.csv', index=False)
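
If the CSV should have a continuous row index and the column order the question asks for, something like the following might work. This is only a sketch: the inline `data` list is a shortened stand-in for the loaded JSON, and the result is written to an in-memory buffer instead of a file so it can be inspected.

```python
import io
import pandas as pd

# Inline stand-in for json.load(data_file); same nested shape as file.json.
data = [
    [{"user": "Phoebe", "text": "Oh my God, hes lost it.", "sent": "non-neutral"},
     {"user": "Monica", "text": "What?", "sent": "surprise"}],
    [{"user": "Joey", "text": "Hey Estelle, listen", "sent": "neutral"}],
]

# ignore_index=True renumbers the rows 0..n-1 instead of restarting per inner list.
df = pd.concat([pd.DataFrame(x) for x in data], ignore_index=True)

# columns= fixes the column order in the CSV regardless of the DataFrame's own order.
buffer = io.StringIO()
df.to_csv(buffer, index=False, columns=["user", "text", "sent"])
csv_text = buffer.getvalue()
```

Passing a file name instead of `buffer` writes the same content to disk.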

EDIT:

If you want a pure-Python solution without pandas, a slightly modified version of this solution works, with one extra loop added for the nested lists:

import json, csv

with open('file.json') as data_file:    
    data = json.load(data_file)  

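# newline="" stops the csv module from writing blank lines on Windows
with open("test.csv", "w", newline="") as out_file:
    f = csv.writer(out_file)
    f.writerow(["user", "sent", "text"])

    for y in data:
        for x in y:  # extra loop added to read the nested lists
            f.writerow([x["user"], x["sent"], x["text"]])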
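
Note that the sample in the question spells the key "User" in one record and "user" in the others, so `x["user"]` would raise a `KeyError` on that data as posted. A minimal sketch of one way around this, assuming the keys differ only by case: lowercase every key before writing with `csv.DictWriter`. The helper name `rows_from_nested` and the inline `data` are hypothetical and only for illustration.

```python
import csv
import io

def rows_from_nested(data):
    """Yield one dict per record, with keys lowercased (hypothetical helper)."""
    for conversation in data:
        for record in conversation:
            yield {key.lower(): value for key, value in record.items()}

# Inline stand-in for json.load(data_file); note the mixed "User"/"user" keys.
data = [
    [{"User": "Phoebe", "text": "Oh my God, hes lost it.", "sent": "non-neutral"},
     {"user": "Monica", "text": "What?", "sent": "surprise"}],
    [{"user": "Joey", "text": "Hey Estelle, listen", "sent": "neutral"}],
]

# Written to an in-memory buffer here; open("test.csv", "w", newline="") works the same way.
out = io.StringIO(newline="")
writer = csv.DictWriter(out, fieldnames=["user", "text", "sent"])
writer.writeheader()
writer.writerows(rows_from_nested(data))
csv_text = out.getvalue()
```

`DictWriter` looks fields up by name, so the column order comes from `fieldnames` and not from the order of keys in each record.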
jezrael

We can concatenate the inner lists with sum if we provide an empty list [] as the starting value:

import json
import pandas as pd

pd.DataFrame(sum(json.load(open('file.json')), [])).to_csv('file.csv', index=False)

sent,text,user
non-neutral,"Oh my God, hes lost it. Hes totally lost it.",Phoebe
surprise,What?,Monica
neutral,"Hey Estelle, listen",Joey
surprise,Well! Well! Well! Joey Tribbiani! So you came back huh? They,Estelle
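
For a large file, `sum(lists, [])` builds a new list at every step, so its cost grows quadratically with the number of inner lists. A sketch of an alternative that flattens in a single pass with `itertools.chain.from_iterable`; the inline `data` is a shortened stand-in for the loaded JSON.

```python
from itertools import chain

# Inline stand-in for json.load(open('file.json')).
data = [
    [{"user": "Phoebe", "text": "Oh my God, hes lost it.", "sent": "non-neutral"},
     {"user": "Monica", "text": "What?", "sent": "surprise"}],
    [{"user": "Joey", "text": "Hey Estelle, listen", "sent": "neutral"}],
]

# chain.from_iterable walks each inner list once and yields its records in order.
flat = list(chain.from_iterable(data))
```

`flat` can then be passed to `pd.DataFrame(flat)` in place of the `sum(...)` expression.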
piRSquared