
The contents of the file look like this:

{ 'match' : 'a', 'score' : '2'},{......}

I've tried pd.DataFrame, and I've also tried reading the file line by line, but either way I get everything in one cell.

I'm new to Python.

Thanks in advance

The expected result is a pandas DataFrame.

  • Do the columns always match? I.e., is the JSON file {'match': 'a', 'score': '2'}, {'match':'b', 'score':'3'}? Or do the dictionaries have different key-value pairs? – Arjun Arun Apr 15 '19 at 17:50
  • Please don't forget to [accept](https://stackoverflow.com/help/someone-answers) one of the answers, @bsa1player. – Jaroslav Bezděk Aug 16 '19 at 11:20

2 Answers


Try using the json_normalize() function.

Example:

import pandas as pd

values = [{'match': 'a', 'score': '2'}, {'match': 'b', 'score': '3'}, {'match': 'c', 'score': '4'}]
# json_normalize is available at the top level since pandas 1.0;
# importing it from pandas.io.json is deprecated
df = pd.json_normalize(values)
print(df)

Output:

  match score
0     a     2
1     b     3
2     c     4
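For flat dictionaries like these, pd.DataFrame(values) gives the same table. json_normalize is mainly useful when the records are nested. Here is a minimal sketch of that case; the "stats" key is made up for illustration and does not come from the question:

```python
import pandas as pd

# Hypothetical nested records; the "stats" key is invented for this example
values = [
    {"match": "a", "stats": {"score": 2}},
    {"match": "b", "stats": {"score": 3}},
]

# Nested keys are flattened into dot-separated column names
df = pd.json_normalize(values)
print(df.columns.tolist())
```

The nested "score" value ends up in a column named "stats.score".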

Kafels

If one line of your file corresponds to one JSON object, you can do the following:

# import library for working with JSON and pandas
import json 
import pandas as pd

# make an empty list
data = []

# open your file and add every row as a dict to the list with data
with open("/path/to/your/file", "r") as file:
    for line in file:
        # skip blank lines, which json.loads cannot parse
        if line.strip():
            data.append(json.loads(line))

# make a pandas data frame
df = pd.DataFrame(data)
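For this one-object-per-line layout, pandas can probably do the whole job itself through read_json with lines=True. A minimal sketch, where the sample text stands in for your file (it is an assumption, not your real data, and it uses double quotes so that it is valid JSON):

```python
import io

import pandas as pd

# Stand-in for the file contents: one valid JSON object per line (assumed data)
text = (
    '{"match": "a", "score": "2"}\n'
    '{"match": "b", "score": "3"}\n'
    '{"match": "c", "score": "4"}\n'
)

# lines=True tells read_json to parse every line as a separate JSON object
df = pd.read_json(io.StringIO(text), lines=True)
print(df)
```

With a real file you would pass the path instead of the StringIO object. Note that read_json infers column dtypes by default, so values may not stay as strings.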

If a line of your file can contain more than one JSON object, you first have to locate each object in the text. One possible way is to scan for every opening brace and let JSONDecoder.raw_decode parse one object from that position. A solution along those lines would look like this:

# import all you will need
import pandas as pd
import json
from json import JSONDecoder

# define function
def extract_json_objects(text, decoder=JSONDecoder()):
    pos = 0
    while True:
        match = text.find('{', pos)
        if match == -1:
            break
        try:
            result, index = decoder.raw_decode(text[match:])
            yield result
            pos = match + index
        except ValueError:
            pos = match + 1

# make an empty list
data = []

# open your file and add every JSON object as a dict to the list with data
with open("/path/to/your/file", "r") as file:
    for line in file:
        for item in extract_json_objects(line):
            data.append(item)

# make a pandas data frame
df = pd.DataFrame(data)
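One caveat: the sample in the question uses single quotes, which is not valid JSON, so json.loads and raw_decode would reject it. If your file really looks like that, it is closer to Python literal syntax, and ast.literal_eval may be a better fit. A minimal sketch, assuming the file is a comma-separated sequence of dict literals (the sample text below is made up in the question's format):

```python
import ast

import pandas as pd

# Assumed file contents: comma-separated dict literals with single quotes
text = "{ 'match' : 'a', 'score' : '2'},{ 'match' : 'b', 'score' : '3'}"

# Wrapping the text in brackets turns it into a list literal,
# which literal_eval parses without executing arbitrary code
records = ast.literal_eval("[" + text + "]")

df = pd.DataFrame(records)
print(df)
```

For a real file you would read its contents into text first, for example with file.read().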
Jaroslav Bezděk