In my Flask project I receive JSON (via a REST API) containing data that I need to convert to a pandas DataFrame. The JSON looks like:

{
    "entity_data":[
                  {"id": 1, "store": "a", "marker": "a"}
    ]
}

I get the JSON and extract the data:

params = request.json
entity_data = params.pop('entity_data')

Then I convert the data into a pandas DataFrame:

entity_ids = pd.DataFrame(entity_data)

The result looks like this:

   id marker store
0   1      a     a

This is not the original order of the columns. I'd like the columns to keep the same order as the keys in the dictionary. How can I do that?

jpp
nofar mishraki
  • Dictionaries do not maintain order. And why do you need to order the columns? You will be accessing them like `df['column_name']` anyway. Just like a dictionary, the order doesn't matter here. – Adithya Dec 25 '18 at 09:31
  • @nofar, please check the answers below and [accept](https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work/5235#5235) one if it helps you, so people can refer to this post later. Thanks :) – anky Dec 25 '18 at 13:18
  • @Adithya, I need the order because I want to print the DataFrame in the same order at the end, after some manipulations. In addition, in my case the column names change dynamically (id, store and marker are just examples). – nofar mishraki Dec 25 '18 at 16:10
  • @anky_91 I still have no solution for my problem – nofar mishraki Dec 25 '18 at 16:11
  • @nofarmishraki check edited answer. :) – anky Dec 25 '18 at 16:37

3 Answers

Just pass the column names via the columns parameter.

entity_ids = pd.DataFrame(entity_data, columns=["id","store","marker"])
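
When the column names are not known in advance, the list passed to columns can be derived from the records themselves. The following is a minimal sketch, assuming Python 3.7+ (where json.loads returns dictionaries in document order); the helper name column_order is hypothetical and not part of pandas.

```python
import json

import pandas as pd


def column_order(records):
    """Return the keys of all records in first-seen order (hypothetical helper)."""
    seen = {}
    for record in records:
        for key in record:
            # dict keys keep insertion order on Python 3.7+, so this
            # records each key the first time it appears
            seen.setdefault(key, None)
    return list(seen)


# Sample payload standing in for the request body
raw = '{"entity_data": [{"id": 1, "store": "a", "marker": "a"}]}'
entity_data = json.loads(raw)["entity_data"]

columns = column_order(entity_data)
entity_ids = pd.DataFrame(entity_data, columns=columns)

print(list(entity_ids.columns))
# ['id', 'store', 'marker']
```

Passing columns explicitly means the result does not depend on how a given pandas version orders the keys of plain dictionaries.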
Chiheb.K
  • But in my case the column names change dynamically (id, store and marker are just examples), so how can I get the original order of the names as in the dictionary? – nofar mishraki Dec 25 '18 at 16:06
  • This doesn't actually answer the question, which is how to retrieve/store json "as-is" without any change in item order within dictionaries. – jpp Dec 25 '18 at 18:02

Use OrderedDict for an ordered dictionary

You should not assume dictionaries are ordered. Although dictionaries preserve insertion order as of Python 3.7, you should not assume that libraries maintain this order when reading JSON into a dictionary or when converting the dictionary to a pandas DataFrame.

The most reliable solution is to use collections.OrderedDict from the standard library:

import json
import pandas as pd
from collections import OrderedDict

params = """{
    "entity_data":[
                  {"id": 1, "store": "a", "marker": "a"}
    ]
}"""

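# in the Flask app, pass the raw request body (e.g. request.get_data(as_text=True))
# instead of params; request.json is already parsed, so the hook cannot be applied to it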
data = json.loads(params, object_pairs_hook=OrderedDict)
entity_data = data.pop('entity_data')

df = pd.DataFrame(entity_data)

print(df)

#    id store marker
# 0   1     a      a
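
The same idea can be wrapped in a small function that takes the raw request text, which keeps it testable without a running server. This is a sketch under assumptions: the function name entity_frame is hypothetical, and in a Flask view the raw text would come from something like request.get_data(as_text=True) rather than the sample string used here.

```python
import json
from collections import OrderedDict

import pandas as pd


def entity_frame(raw_body):
    """Parse a JSON string and build a DataFrame whose columns follow key order."""
    # object_pairs_hook makes every JSON object an OrderedDict,
    # so the key order of the document is preserved explicitly
    payload = json.loads(raw_body, object_pairs_hook=OrderedDict)
    return pd.DataFrame(payload["entity_data"])


# In a Flask view this might be called roughly as (assumption):
#     df = entity_frame(request.get_data(as_text=True))
sample = (
    '{"entity_data": ['
    '{"id": 1, "store": "a", "marker": "a"}, '
    '{"id": 2, "store": "b", "marker": "b"}]}'
)
df = entity_frame(sample)

print(df.columns.tolist())
# ['id', 'store', 'marker']
```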
jpp

Assuming you control the JSON sender, you can send the column order in the JSON itself, like this:

{
"order": ["id", "store", "marker"],
"entity_data": {"id": [1, 2], "store": ["a", "b"],
"marker": ["a", "b"]}
}

Then create the DataFrame with the columns specified, as Chiheb.K said.

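from flask import request
import pandas as pd

params = request.json
entity_data = params.pop('entity_data')
order = params.pop('order')  # column order supplied by the sender
# build the DataFrame with the columns in the requested order
entity_df = pd.DataFrame(entity_data, columns=order)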
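
Since the goal is to print the DataFrame in the original order after some manipulations, the order list can also be reused at the end to restore the column order. The sketch below assumes the payload shape shown above and parses a sample string instead of a real request; the store_upper column and the alphabetical sort are hypothetical manipulations used only for illustration.

```python
import json

import pandas as pd

# Sample payload standing in for request.json
raw = """{
    "order": ["id", "store", "marker"],
    "entity_data": {"id": [1, 2], "store": ["a", "b"], "marker": ["a", "b"]}
}"""

params = json.loads(raw)
entity_data = params.pop("entity_data")
order = params.pop("order")

entity_df = pd.DataFrame(entity_data, columns=order)

# Hypothetical manipulations: add a column, then a step that reorders columns
entity_df["store_upper"] = entity_df["store"].str.upper()
entity_df = entity_df[sorted(entity_df.columns)]

# Restore the sender's order, keeping any new columns at the end
extra = [col for col in entity_df.columns if col not in order]
entity_df = entity_df[order + extra]

print(entity_df.columns.tolist())
# ['id', 'store', 'marker', 'store_upper']
```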

If you cannot specify the order explicitly in the JSON, see this answer on passing object_pairs_hook to JSONDecoder to get an OrderedDict, and then create the DataFrame.

Adithya