I have a nested dictionary that I want to turn into a dataframe. When I use

pd.DataFrame(my_dict)

it reorders the columns alphabetically. I want the columns to stay in input order.

There is a problem almost exactly like this here:

Pandas: create dataframe without auto ordering column names alphabetically

The accepted answer has two solutions. I do not think the first solution would work in my case, or it would at least be tedious and less readable, because my dictionary is nested and much longer than the one in that question.

The second solution involves using collections.OrderedDict to create an ordered dict which is then converted into a dataframe. This supposedly should fix the problem but it does not for me. The dataframe is still ordered alphabetically. I think it may have to do with the fact that my dictionary is nested. I tried using collections.OrderedDict on all nested dicts and it still did not work. Well it worked, but did not change my sorted column issue. Here is my code:

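import collections
import os

my_dict = collections.OrderedDict()
# ... other code ...
for fname in os.listdir(myfile):
    # labels collected for this file, in the order I want them as columns
    labels = collections.OrderedDict({'A': 1, 'C': 2, 'B': 3})  # etc.
    my_dict.update({fname: labels})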

Obviously this is very simplified, but it shows the general idea. I make an empty ordered dictionary, then sort through a file and collect labels with values and store them in an ordered dictionary, then update the my_dict with the fname and labels ordered dict.

The dataframe that is output when I use pd.DataFrame(my_dict).T orders the columns (e.g. A,C,B) in alphabetical order. I would like it to be in input order.

If you know why my dataframe is still being sorted alphabetically, or know another way to keep the input order, please let me know!

masked
  • In later versions of Python (>= 3.6), the order of keys in a vanilla Python dict is remembered. So something like `pandas.DataFrame(dict(a=[1], d=[2], c=[3]))` will have a columns array ['a','d','c']. Is using a more recent version of Python an option? – Chris Jul 23 '20 at 20:35
  • No, it is not. I probably could do `pandas.DataFrame(my_dict(a=[1], d=[2], etc))`. However, I would like to not have to type out all of it since I have a ton of columns. I will try it though and use it for now unless somebody knows a different solution. – masked Jul 23 '20 at 20:46
  • I have the exact same issue and it is super annoying as the console output in new order does not make any sense at all. Will post if I find a solution. – OnceUponATime Apr 07 '22 at 11:09

1 Answer

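Pandas will not automatically re-order columns when dataframes are combined with pd.concat using the arguments axis=1 and sort=False. DataFrame.append has no axis parameter and was removed in pandas 2.0, so pd.concat is the call that accepts these arguments.

Here is an example of appending a dataframe df (read from a spreadsheet) to an empty dataframe f while preserving the original column order:

f = pd.concat([f, df], axis=1, ignore_index=False, sort=False)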

ignore_index=False preserves the labels of the original columns.
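
For the nested-dictionary case in the question, one option that does not depend on the Python or pandas version is to record the label order yourself and reindex the columns explicitly. The sketch below is only an illustration: the file names, labels, and the `column_order` helper list are made up. It builds each inner OrderedDict from a list of pairs, because on Python < 3.6 a dict literal passed to OrderedDict may already have lost its order before OrderedDict sees it.

```python
from collections import OrderedDict

import pandas as pd

# Hypothetical stand-in for the dictionary built in the question.
my_dict = OrderedDict()
my_dict["file1"] = OrderedDict([("A", 1), ("C", 2), ("B", 3)])
my_dict["file2"] = OrderedDict([("A", 4), ("C", 5), ("B", 6)])

# Record each label the first time it is seen, in input order.
column_order = []
for labels in my_dict.values():
    for key in labels:
        if key not in column_order:
            column_order.append(key)

# Transpose so files become rows, then force the recorded column order.
df = pd.DataFrame(my_dict).T.reindex(columns=column_order)
print(df)
```

Because the order is applied with reindex at the end, it no longer matters whether the DataFrame constructor sorted the labels along the way.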

OnceUponATime