14

When storing data in a json object with to_json, and reading it back with read_json, rows and columns are returned sorted alphabetically. Is there a way to keep the results ordered or reorder them upon retrieval?

Alon
  • 743
  • 10
  • 23

3 Answers3

17

You could use orient='split', which stores the index and column information in lists, which preserve order:

In [34]: df
Out[34]: 
   A  C  B
5  0  1  2
4  3  4  5
3  6  7  8

In [35]: df.to_json(orient='split')
Out[35]: '{"columns":["A","C","B"],"index":[5,4,3],"data":[[0,1,2],[3,4,5],[6,7,8]]}'

In [36]: pd.read_json(df.to_json(orient='split'), orient='split')
Out[36]: 
   A  C  B
5  0  1  2
4  3  4  5
3  6  7  8

Just remember to use orient='split' on reading as well, or you'll get

In [37]: pd.read_json(df.to_json(orient='split'))
Out[37]: 
  columns       data  index
0       A  [0, 1, 2]      5
1       C  [3, 4, 5]      4
2       B  [6, 7, 8]      3
DSM
  • 342,061
  • 65
  • 592
  • 494
  • 1
    i have a bunch of dates, thats getting converted to epoch timestamps whereas the original format is YYYY-MM-DD. How can i retain the order, and date formats? – Naveen Sep 22 '18 at 07:12
  • @Naveen, you need manually to set _dtypes_ after `read_json()`, like this `df['date_operation'] = df['date_operation'].astype('datetime64[ms]') . This convert default milliseconds produced by `df.to_json()` – Egor B Eremeev Dec 03 '20 at 17:02
  • Is the written json index order `json.loads(df.to_json(orient='split'))['index']` **guaranteed** to be identical to the original index `df.index.to_list()` ??? It's not written in the documentation as I can see. That would save us from deserializing to get the index. – Martin Thøgersen May 04 '22 at 09:08
0

If you want to make a format with "orient='records'" and keep orders of the column, try to make a function like this. I don't think it is a wise approach, and do not recommend because it does not guarantee its order.

def df_to_json(df):
    res_arr = []
    ldf = df.copy()
    ldf=ldf.fillna('')
    lcolumns = [ldf.index.name] + list(ldf.columns)
    for key, value in ldf.iterrows():
        lvalues = [key] + list(value)
        res_arr.append(dict(zip(lcolumns, lvalues)))
    return json.dumps(res_arr)

In addition, for reading without sorted column please ref this [link] (Python json.loads changes the order of the object)

Good Luck

Kwang-Chun Kang
  • 351
  • 3
  • 12
0

lets say you have a pandas dataframe, that you read

import pandas as pd
df = pd.read_json ('/abc.json')
df.head()

that give following enter image description here

now there are two ways to save to json using pandas to_json result.sample(200).to_json('abc_sample.json',orient='split') that will give the order like this one column enter image description here

however, to preserve the order like in csv, use this one

result.sample(200).to_json('abc_sample_2nd.json',orient='records')

this will give result as enter image description here

Shaina Raza
  • 1,474
  • 17
  • 12