I have a complex dictionary that I want to unpack into a single DataFrame, but I cannot figure out how. I want to unpack the data contained in 'rows' (i.e. everything inside the []) into a single DataFrame. I have tried lots of ways of indexing into the dictionary, to no avail.

Here's the data:

{None: {'transfers': {'1': {'rows': [{'pointOfSaleID': 2,
  'initialAmount': '£0.00',
  'opened': 'xx, 27/11/2018 11:58',
  'dayIncome': '£336.23',
  'cash': [{'dateTime': '27/11/2018 18:23',
    'employeeName': 'xx',
    'sum': '-£45.00',
    'comment': 'cabs to collect in store stock\nEvents'}],
  'cashTotal': '£291.23',
  'cashExpected': '£291.23',
  'closed': 'xx, 27/11/2018 20:54',
  'banked': '£0.00',
  'left': '£0.00',
  'totalCounted': '£0.00',
  'difference': '-£291.23',
  'varianceReason': '',
  'totalTransactions': 48},
 {'pointOfSaleID': 2,
  'initialAmount': '£0.00',
  'opened': 'xx, 28/11/2018 09:16',
  'dayIncome': '£35.94',
  'cashTotal': '£35.94',
  'cashExpected': '£35.94',
  'closed': '----',
  'banked': '----',
  'left': '----',
  'totalCounted': '----',
  'difference': '----',
  'varianceReason': '',
  'totalTransactions': 3}...]

How can I access just the data within rows and unpack it into a DataFrame?

lmonty

2 Answers

Assuming your dictionary is named 'd', this should get the list of nested dicts, with one dict per record (per row):

d[None]['transfers']['1']['rows']

You should be able to pass that into the DataFrame constructor:

df = pd.DataFrame(d[None]['transfers']['1']['rows'])
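As a minimal sketch of the two steps above, here is a trimmed-down copy of the structure from the question (two rows, only a few of the fields), indexed down to 'rows' and passed to the constructor. The name `d` is the same assumption as above, and the shortened data is for illustration only.

```python
import pandas as pd

# Trimmed-down stand-in for the dictionary in the question:
# same nesting, two rows, only a few of the original fields.
d = {None: {'transfers': {'1': {'rows': [
    {'pointOfSaleID': 2,
     'dayIncome': '£336.23',
     'cash': [{'dateTime': '27/11/2018 18:23',
               'employeeName': 'xx',
               'sum': '-£45.00'}],
     'totalTransactions': 48},
    {'pointOfSaleID': 2,
     'dayIncome': '£35.94',
     'totalTransactions': 3},
]}}}}

# Walk down the nesting: None -> 'transfers' -> '1' -> 'rows'.
rows = d[None]['transfers']['1']['rows']

# A list of dicts becomes one DataFrame row per dict;
# keys missing from a dict (here 'cash' in the second row) become NaN.
df = pd.DataFrame(rows)
print(df)
```

The second row has no 'cash' key, so its entry in that column is NaN, while the first row's entry is still a Python list of dicts.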

If that works, the cash column will hold a list of nested dictionaries for each row that has cash entries (and NaN for rows that have none). To flatten that, I'd point you toward [json_normalize](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.json.json_normalize.html), which this SO thread might help you to understand: pandas.io.json.json_normalize with very nested json
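One possible way to flatten the cash entries, sketched on a shortened version of the rows from the question. It assumes pandas 1.0 or later, where the function is available as `pd.json_normalize` (older versions expose it as `pandas.io.json.json_normalize`). The choice of `pointOfSaleID` and `opened` as the fields to carry over is only an illustration, and the `filled` helper list is not part of the original answer.

```python
import pandas as pd

# Shortened stand-in for d[None]['transfers']['1']['rows'] from the question.
rows = [
    {'pointOfSaleID': 2,
     'opened': 'xx, 27/11/2018 11:58',
     'cash': [{'dateTime': '27/11/2018 18:23',
               'employeeName': 'xx',
               'sum': '-£45.00'}]},
    {'pointOfSaleID': 2,
     'opened': 'xx, 28/11/2018 09:16'},  # no 'cash' key on this row
]

# json_normalize raises KeyError when record_path is missing from a record,
# so give every row a 'cash' list (empty where there were no cash entries).
filled = [{**row, 'cash': row.get('cash', [])} for row in rows]

# One output row per cash entry, with the chosen parent fields repeated.
cash = pd.json_normalize(
    filled,
    record_path='cash',
    meta=['pointOfSaleID', 'opened'],
)
print(cash)
```

Rows with an empty cash list contribute nothing to this table, so it can be kept alongside the main DataFrame rather than replacing it.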

Peter Leimbigler

The from_dict function should work in your case. Its documentation is pretty good: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.from_dict.html

Lorientas
I don't think `to_dict` is useful here. OP wants to convert some part of a list/dict structure into a DataFrame - `to_dict` does the reverse of this. – sjw Nov 28 '18 at 12:56
You are right, my bad. It should be from_dict. I will edit the response. – Lorientas Nov 28 '18 at 13:12