15

I have a dictionary that was converted from a DataFrame as shown below:

a = d.to_json(orient='index')

Dictionary :

{"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}

What I need is for it to be in a list, essentially a list of dictionaries, because that is the format used in the rest of the code. So I just wrap it in []:

input_dict = [a]

input_dict :

['{"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}']

I need to remove the single quotes just after the [ and just before the ], and also have the PKID values in the form of a list.

How can this be achieved?

Expected Output :

[ {"yr":2017,"PKID":[58306, 57011],"Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":[1234,54321],"Subject":"XYZ","ID":"T002"} ]

NOTE: The PKID column holds multiple integer values, which have to come out as a list of integers; a string is not acceptable. So we need "PKID":[58306, 57011] and not "PKID":"[58306, 57011]".

Shankar Pandey

5 Answers

30

pandas.DataFrame.to_json returns a string (JSON string), not a dictionary. Try to_dict instead:

>>> df
   col1  col2
0     1     3
1     2     4
>>> [df.to_dict(orient='index')]
[{0: {'col1': 1, 'col2': 3}, 1: {'col1': 2, 'col2': 4}}]
>>> df.to_dict(orient='records')
[{'col1': 1, 'col2': 3}, {'col1': 2, 'col2': 4}]
Norrius
  • Thanks. Is there also a way to NOT have the index values, in the same format? Like just: [' {"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"} '] – Shankar Pandey Feb 28 '18 at 11:19
  • 2
    @ShankarPandey I added another example – Norrius Feb 28 '18 at 11:25
  • Thanks. Is there a way to also make one column's values a LIST? So if col2 had 2 integers separated by a comma, how would we get something like the example below: [{'col1': 1, 'col2': [3, 4] }, {'col1': 2, 'col2': [5,6] }] – Shankar Pandey Feb 28 '18 at 11:45
  • @ShankarPandey Just iterate through the list and transform your values: `d['PKID'] = list(map(int, d['PKID'].split(',')))` – Norrius Feb 28 '18 at 12:21
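
The transformation suggested in the comment above could be sketched as follows. This is a minimal illustration, not the answerer's own code: the `records` list is written by hand to mirror what `df.to_dict(orient='records')` would be expected to return for the question's data, and `split_pkid` is a hypothetical helper name.

```python
# Hand-written stand-in for df.to_dict(orient='records') on the question's
# data (an assumption, so the snippet runs without pandas).
records = [
    {"yr": 2017, "PKID": "58306, 57011", "Subject": "ABC", "ID": "T001"},
    {"yr": 2018, "PKID": "1234,54321", "Subject": "XYZ", "ID": "T002"},
]


def split_pkid(record):
    """Return a copy of `record` with PKID parsed into a list of ints."""
    converted = dict(record)  # shallow copy, so the input is left untouched
    # int() ignores surrounding whitespace, so "58306, 57011" parses cleanly
    converted["PKID"] = [int(part) for part in record["PKID"].split(",")]
    return converted


input_dict = [split_pkid(r) for r in records]
print(input_dict)
```

Returning a copy rather than mutating each record in place keeps the original strings available, which may or may not matter for the rest of the code.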
3

Here is one way:

from collections import OrderedDict

d = {"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}

list(OrderedDict(sorted(d.items())).values())

# [{'ID': 'T001', 'PKID': '58306, 57011', 'Subject': 'ABC', 'yr': 2017},
#  {'ID': 'T002', 'PKID': '1234,54321', 'Subject': 'XYZ', 'yr': 2018}]

Note that the ordered dictionary is sorted by the string keys, as supplied. You may wish to convert these to integers before any processing via d = {int(k): v for k, v in d.items()}.
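
To illustrate why the key conversion matters, here is a small sketch with a hypothetical four-row dictionary (the keys and values are made up for the example; the question's data has only two rows, where both orderings happen to agree).

```python
# Hypothetical data: string keys as produced by a JSON round trip
d = {"10": "k", "2": "c", "1": "b", "0": "a"}

# Sorting the string keys compares them character by character
lexical_keys = sorted(d)
print(lexical_keys)

# Converting the keys to int first gives the numeric row order
d_int = {int(k): v for k, v in d.items()}
values_in_order = [v for _, v in sorted(d_int.items())]
print(values_in_order)
```

With string keys, "10" sorts before "2", so any frame with ten or more rows would come out in the wrong order unless the keys are converted.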

jpp
  • 1
    dicts are unordered, so this won't preserve the (eventual) ordering implied by the keys... Might or might not be an issue for the OP... – bruno desthuilliers Feb 28 '18 at 11:11
  • Close but not quite there - here the keys are strings so you'll get lexical ordering (ie : `sorted(["1", "2", "10", "11"])` => `['1', '10', '11', '2']`). You want to convert keys to ints before IMHO ;) – bruno desthuilliers Feb 28 '18 at 11:15
  • 1
    As I originally mentioned: this "Might or might not be an issue for the OP" - but we actually don't know since the OP didn't post the exact expected output (and obviously everyone interpreted it differently) ;) I only wanted to make clear that your first solution would eventually lose ordering and that the second would use lexical ordering instead of numerical ordering, that's all. – bruno desthuilliers Feb 28 '18 at 11:23
0

You are converting your dictionary to JSON, which is a string. Then you wrap the resulting string in a list. So, naturally, the result is a string inside of a list.

Try instead: [d] where d is your raw dictionary (not the converted JSON).
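
If only the JSON string is at hand, one possible way back to a dictionary is the standard library's `json.loads`. The sketch below assumes `a` holds the string shown in the question; it is written out as a literal here rather than produced by `to_json`, so the snippet runs without pandas.

```python
import json

# Stand-in for a = d.to_json(orient='index'), copied from the question
a = ('{"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},'
     '"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}')

print(type(a).__name__)  # a is a str, which is why [a] shows quotes

parsed = json.loads(a)        # parse the JSON text back into a dict
rows = list(parsed.values())  # drop the "0"/"1" index keys
print(rows)
```

Note that PKID is still a comma-separated string at this point; splitting it into integers is a separate step.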

Ryan
0

You can use a list comprehension

Ex:

d = {"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}
# Parenthesized print runs on both Python 2 and 3; dict order in the output may differ by version
print([{k: v} for k, v in d.items()])

Output:

[{'1': {'PKID': '1234,54321', 'yr': 2018, 'ID': 'T002', 'Subject': 'XYZ'}}, {'0': {'PKID': '58306, 57011', 'yr': 2017, 'ID': 'T001', 'Subject': 'ABC'}}]
Rakesh
0

What about something like this:

from operator import itemgetter

d = {"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":
    {"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}

sorted_d = sorted(d.items(), key=lambda x: int(x[0]))

print(list(map(itemgetter(1), sorted_d)))

Which Outputs:

[{'yr': 2017, 'PKID': '58306, 57011', 'Subject': 'ABC', 'ID': 'T001'}, 
 {'yr': 2018, 'PKID': '1234,54321', 'Subject': 'XYZ', 'ID': 'T002'}]
RoadRunner