-2

Trying to figure out how to write json to pandas DataFrame. In code I have:

...
resp = s.get(URL)
df = pd.DataFrame.from_dict(resp.json())
print(df.to_string())

My response in json looks like this:

{
    "instrument": "EUR_USD",
    "granularity": "S5",
    "candles": [
        {
            "complete": true,
            "volume": 8,
            "time": "2020-02-28T17:22:55.000000000Z",
            "mid": {
                "o": "1.10087",
                "h": "1.10088",
                "l": "1.10083",
                "c": "1.10083"
            }
        },
        {
            "complete": false,
            "volume": 7,
            "time": "2020-02-28T17:23:00.000000000Z",
            "mid": {
                "o": "1.10084",
                "h": "1.10084",
                "l": "1.10078",
                "c": "1.10078"
            }
        }
    ]
}

If I print df, I get this:

  instrument granularity                                                                                                                                               candles
0    EUR_USD          S5   {'complete': True, 'volume': 17, 'time': '2020-02-28T17:26:55.000000000Z', 'mid': {'o': '1.10022', 'h': '1.10023', 'l': '1.10014', 'c': '1.10014'}}
1    EUR_USD          S5  {'complete': False, 'volume': 19, 'time': '2020-02-28T17:27:00.000000000Z', 'mid': {'o': '1.10012', 'h': '1.10024', 'l': '1.10012', 'c': '1.10024'}}

How can I manipulate writing json to pandas df, so that my df would look like:

time                                open        high     low      close    volume
2020-02-28T17:26:55.000000000Z      1.10022     1.10023  1.10014  1.10014  17
...
Beerus
  • 34
  • 8
  • Have you done any research, or tried to write a solution yourself? – AMC Feb 28 '20 at 20:25
  • Does this answer your question? [JSON to pandas DataFrame](https://stackoverflow.com/questions/21104592/json-to-pandas-dataframe) – AMC Feb 28 '20 at 20:25
  • Yes I did. What's the purpose of your question? – Beerus Feb 29 '20 at 08:08
  • Then can you include it in your post? It’s relevant information, after all. – AMC Feb 29 '20 at 12:34
  • If you scroll down, there's an accepted answer already. Thank you for your input. – Beerus Feb 29 '20 at 21:49
  • I know, but it’s not because a post has an answer that it should be completely abandoned, right? Wouldn’t that defeat the purpose of this site? – AMC Feb 29 '20 at 21:51
  • Stackoverflow is >11 years old, you're member for 11 months. Worrying too much? Again, thank you for your input and if you want to chat - find someone else. – Beerus Mar 01 '20 at 13:57

1 Answers1

1

You can parse out the remaining nested JSON column from your DataFrame using json_normalize, a handy function under pandas.io.json.

from pandas.io.json import json_normalize

# Your current DataFrame
df
  instrument granularity                                            candles
0    EUR_USD          S5  {'complete': True, 'volume': 8, 'time': '2020-...
1    EUR_USD          S5  {'complete': False, 'volume': 7, 'time': '2020...

# Parse the `candles` column
parsed = json_normalize(df['candles'])
parsed
   complete  volume                            time    mid.o    mid.h    mid.l    mid.c
0      True       8  2020-02-28T17:22:55.000000000Z  1.10087  1.10088  1.10083  1.10083
1     False       7  2020-02-28T17:23:00.000000000Z  1.10084  1.10084  1.10078  1.10078

# Concat and rename columns as needed
renamer = {'mid.o': 'open', 
           'mid.h': 'high', 
           'mid.l': 'low', 
           'mid.c': 'close'}
res = pd.concat([df, parsed], axis=1)
res = res.drop(columns=['complete', 'candles']).rename(columns=renamer)
res
  instrument granularity  volume                            time     open     high      low    close
0    EUR_USD          S5       8  2020-02-28T17:22:55.000000000Z  1.10087  1.10088  1.10083  1.10083
1    EUR_USD          S5       7  2020-02-28T17:23:00.000000000Z  1.10084  1.10084  1.10078  1.10078
Peter Leimbigler
  • 10,775
  • 1
  • 23
  • 37