I have a Python dictionary named data that contains sub-dictionaries, such as:

data = {'ind1' : {}, 'ind10' : {}, 'ind11' : {}, 'ind12' : {}, 'ind13' : {}, 'ind14' : {}, 'ind15' : {}, 'ind16' : {}, 'ind17' : {}, 'ind18' : {}, 'ind19' : {}, 'ind2' : {}, 'ind20' : {}, 'ind3' : {}, 'ind30' : {}, 'ind31' : {}, 'ind5' : {}, 'ind6' : {}, 'ind7' : {}, 'ind8' : {}, 'ind9' : {}}

I want to sort the dictionary by key so that the order is:

ind1 : {}
ind2 : {}
ind3 : {}
...
ind10 : {}
ind11 : {}

I tried data = collections.OrderedDict(sorted(data.items())), using the collections library.

This gives the result:

ind1 : {}
ind11 : {}
ind12 : {}
ind13 : {}
.....
ind20 : {}
ind21 : {}
....
ind3 : {}
ind4 : {}
....

Please help

Jon Clements
  • 5
    You want to [natural sort](http://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort) your keys first... – Jon Clements Aug 11 '15 at 06:40
  • 1
    Why do you want to sort the dictionary by key? Perhaps that's not actually what you need to do what you want to do. – Cyphase Aug 11 '15 at 06:55
  • Using `OrderedDict` may be a bad idea, the keys in `OrderedDict`s are sorted by insertion order, nothing else. Now if you have `ind1` and `ind3` inserted and nicely ordered you cannot insert `ind2` and expect it to be nicely ordered any longer. – skyking Aug 11 '15 at 07:07
  • Thanks, **natsort** did the best, as it sorts alphabetically and naturally too – Kangkon Saikia Aug 11 '15 at 07:12

4 Answers

1

Do you need to have the keys prefixed with "ind"? You could use integers as the keys, which would sort correctly. Right now the keys are strings, so they are compared character by character, which is why "ind10" ends up before "ind2".

If you can't change the keys, and assuming they all follow the same format, sort using this:

 collections.OrderedDict(sorted(data.items(), key=lambda kv: int(kv[0][3:])))

This uses the integer after the prefix as the sort key.
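To make the difference concrete, here is a minimal sketch comparing the default string ordering with the numeric ordering. The helper name `numeric_suffix` and the small sample dictionary are made up for illustration, and the sketch assumes every key is the fixed prefix followed by digits only.

```python
from collections import OrderedDict

# Small sample in the same shape as the question's data (illustrative only).
data = {'ind1': {}, 'ind10': {}, 'ind2': {}, 'ind9': {}}

# Default ordering compares strings character by character,
# so 'ind10' sorts before 'ind2' because '1' < '2'.
lexicographic = sorted(data)


def numeric_suffix(key, prefix='ind'):
    """Hypothetical helper: return the integer that follows a fixed prefix."""
    if not key.startswith(prefix):
        raise ValueError('unexpected key format: %r' % key)
    # int() itself raises ValueError if the remainder is empty or not numeric.
    return int(key[len(prefix):])


# Ordering by the integer suffix gives the order asked for in the question.
numeric = sorted(data, key=numeric_suffix)

# Rebuild an ordered mapping from the sorted keys.
ordered = OrderedDict((key, data[key]) for key in numeric)
```

Keys that do not match the assumed format (a different prefix, or no number at all) raise `ValueError` here rather than being sorted silently into the wrong place.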

cheniel
  • No, the keys are some file names; there may be another prefix, say foo1, bar2... or only foo, bar etc. – Kangkon Saikia Aug 11 '15 at 06:49
  • This code works nicely according to your problem. You may prefer editing your problem by explaining some corner cases. – Vineet Kumar Doshi Aug 11 '15 at 06:53
  • @KangkonSaikia, you should always use real data (or as close to real data as possible) in your examples :). If the keys are going to be file names, then either use those exact file names, or at least file names that represent the variety in your real data. – Cyphase Aug 11 '15 at 06:56
0

Do you really want to sort the data inside the dictionary, or do you simply want to sort the list provided by .keys()?

If you are sorting only a list of values, then this link should contain what you need: https://wiki.python.org/moin/HowTo/Sorting
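If only the iteration order matters, one option is to leave the dictionary as it is and sort the keys at the point of use. This is a rough sketch with made-up sample values, assuming the `ind` prefix from the question:

```python
# Sample data with illustrative values; the real sub-dictionaries may differ.
data = {'ind10': {'size': 10}, 'ind2': {'size': 2}, 'ind1': {'size': 1}}

# Iterating a dict yields its keys, so sorted(data) sorts the keys only.
sorted_keys = sorted(data, key=lambda key: int(key[3:]))

# Look each value up in the original, unsorted dictionary.
for key in sorted_keys:
    print(key, data[key])
```

The dictionary itself is never reordered, so keys added later are picked up the next time the keys are sorted.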

If you wish to sort data inside the dictionary, I'm intrigued to know why. I'll follow and add suggestions as you respond.

Good luck!

0

You need to use a natural sort algorithm on your keys if you want "ind10" to be after "ind9" ;^)

The following helper is from wizkid on ActiveState:

def keynat(string):
  r'''A natural sort helper function for sort() and sorted()
  without using regular expression.

  >>> items = ('Z', 'a', '10', '1', '9')
  >>> sorted(items)
  ['1', '10', '9', 'Z', 'a']
  >>> sorted(items, key=keynat)
  ['1', '9', '10', 'Z', 'a']
  '''
  r = []
  for c in string:
     if c.isdigit():
        if r and isinstance(r[-1], int):
           r[-1] = r[-1] * 10 + int(c)
        else:
           r.append(int(c))
     else:
        r.append(c)
  return r

import collections

# items() works on both Python 2 and 3; iteritems() exists only on Python 2.
data = collections.OrderedDict(
  sorted(
    data.items(),
    key=lambda row: keynat(row[0])
  )
)
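One caveat with the helper above: it returns a list that mixes `int` and `str` elements. On Python 3 an `int` cannot be compared with a `str`, so inputs such as `'Z'` and `'10'` from its own docstring raise `TypeError` there. The following is a sketch of a Python 3 friendly variant, not the ActiveState recipe; the name `natural_key` and the sample keys are made up for illustration, and it uses a regular expression to split each key into digit and non-digit runs.

```python
import re
from collections import OrderedDict

# The capturing group makes re.split keep the digit runs in its result.
_DIGIT_RUN = re.compile(r'(\d+)')


def natural_key(text):
    """Hypothetical helper: build a sort key that orders digit runs numerically.

    Every chunk becomes a (flag, value) pair. The flag is compared first,
    so an int is only ever compared with an int and a str with a str.
    """
    parts = []
    for index, chunk in enumerate(_DIGIT_RUN.split(text)):
        if chunk == '':
            continue  # empty strings appear at the edges of the split
        if index % 2:
            # Odd positions hold the captured digit runs.
            parts.append((0, int(chunk)))
        else:
            parts.append((1, chunk))
    return tuple(parts)


# Keys with mixed prefixes, some without any number (illustrative only).
data = {'foo1': {}, 'bar2': {}, 'foo': {}, 'bar': {}, 'ind10': {}, 'ind9': {}}

sorted_data = OrderedDict(
    sorted(data.items(), key=lambda row: natural_key(row[0]))
)
```

With this key, digit runs sort before text at the same position, and a key without a number sorts before the same prefix followed by a number.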
bufh
0

If you don't need each key to start with 'ind', then the keys could all just be ints, and sorting would work as expected.

If they do need to be strings starting with 'ind', you could do this:

from __future__ import print_function

import pprint

from collections import OrderedDict


def ind_sort_key(item):
    item_key, item_value = item
    return int(item_key[3:])


data = {
    'ind1': {}, 'ind10': {}, 'ind11': {}, 'ind12': {}, 'ind13': {},
    'ind14': {}, 'ind15': {}, 'ind16': {}, 'ind17': {}, 'ind18': {},
    'ind19': {}, 'ind2': {}, 'ind20': {}, 'ind3': {}, 'ind30': {},
    'ind31': {}, 'ind5': {}, 'ind6': {}, 'ind7': {}, 'ind8': {}, 'ind9': {},
    }

sorted_data = OrderedDict(sorted(data.items(), key=ind_sort_key))

pprint.pprint(sorted_data)

This results in:

{'ind1': {},
 'ind2': {},
 'ind3': {},
 'ind5': {},
 'ind6': {},
 'ind7': {},
 'ind8': {},
 'ind9': {},
 'ind10': {},
 'ind11': {},
 'ind12': {},
 'ind13': {},
 'ind14': {},
 'ind15': {},
 'ind16': {},
 'ind17': {},
 'ind18': {},
 'ind19': {},
 'ind20': {},
 'ind30': {},
 'ind31': {}}
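On Python 3.7 and later, the built-in `dict` also preserves insertion order, so the same idea works without `OrderedDict`. This is a short sketch with a reduced sample; it also shows that either kind of dictionary only remembers insertion order, so a key added afterwards goes to the end instead of its sorted position.

```python
# Reduced sample of the question's data (illustrative only).
data = {'ind1': {}, 'ind10': {}, 'ind2': {}, 'ind20': {}, 'ind3': {}}

# Requires Python 3.7+: a plain dict keeps the order in which items are inserted.
sorted_data = dict(sorted(data.items(), key=lambda item: int(item[0][3:])))
order_before = list(sorted_data)

# A later insertion is appended; the dictionary does not re-sort itself.
sorted_data['ind4'] = {}
order_after = list(sorted_data)

print(order_before)
print(order_after)
```

If keys are added after sorting, the mapping has to be rebuilt from a fresh `sorted()` call to restore the order.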
Cyphase