
I have a data structure like the one below and would like to sort the lists in all the sub-dictionaries based on the values of the 'order' entry.

Input:

to_sort = [
('Fruits', 
    {
    'size': {1:[4, 2, 7,9]}, 
    'name': {1:['Orange', 'Apple', 'Kiwi', 'Mango']},
    'color': {1:['Orange', 'Red', 'Brown','Green']},
    'order': {1:[2, 1, 4,3]}
    }
)
]

output:

[
('Fruits', 
    {
    'size': {1:[2, 4, 9, 7]}, 
    'name': {1:['Apple', 'Orange', 'Mango', 'Kiwi']},
    'color':{1:['Red', 'Orange', 'Green', 'Brown']},
    'order':{1:[1, 2, 3, 4]}
    }
)
]

I tried using the lambda

sort = to_sort[1]
print(sorted(sort.items(), key=lambda i: i['order'].values()))

I am getting a "tuple indices must be integers or slices, not str" error.

juanpa.arrivillaga
Sandy
  • Is there a reason for each item in the dictionary being a dictionary with a single list? Can those instead just be lists? – FamousJameous Aug 31 '16 at 20:27
  • Wait, so you want to sort the *list* values of the sub-dictionaries. Dictionaries are unordered, after all... – juanpa.arrivillaga Aug 31 '16 at 20:28
  • Yea, what you want to do is possible, but is a bit of a headache. I know it's not always possible, but I would rethink your data structure here... Like FamousJameous points out, why can't e.g. `size` simply have a value of `[2, 4, 9, 7]`? – juanpa.arrivillaga Aug 31 '16 at 20:32
  • That's not a valid dictionary. Can you fix it? – ayhan Aug 31 '16 at 20:38
  • @ayhan I know you're just itching to get this into a DataFrame... – juanpa.arrivillaga Aug 31 '16 at 20:39
  • @Juanpa.Arrivillaga solution is outstanding but the array are not the same when reading, they come in different order. i take FamousJameous solution as reliable. Thank you all. – Sandy Sep 01 '16 at 18:47
  • @Sandy what do you mean the array comes in a different order? What array? – juanpa.arrivillaga Sep 01 '16 at 22:59
  • @Juanpa.Arrivillaga The Values in the array come out in different Order each time. There is a catch. "{'color': {1: "can be "{'color': {12:" or "{'color': {123:" in that case sorted_values = zip(*sorted(zip(*(to_sort[0][1][k][1] will not work. We have to substitute the whole value sorted_values = zip(*sorted(zip(*(to_sort[0][1][k]["value"]. any idea how to do it – Sandy Sep 02 '16 at 15:56
  • @Sandy sure, my answer has been edited. – juanpa.arrivillaga Sep 02 '16 at 19:43
  • @juanpa.arrivillaga Thanks – Sandy Sep 06 '16 at 13:45

2 Answers


Assuming you are okay with modifying your data structure as mentioned in the comments, this will work for you. This is adapted from this other question: Sorting list based on values from another list?

to_sort = [('Fruits', {
    'size': [4, 2, 7,9],
    'name': ['Orange', 'Apple', 'Kiwi', 'Mango'],
    'color': ['Orange', 'Red', 'Brown','Green'],
    'order': [2, 1, 4,3]
    })
]

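postsort = []
for category, catdata in to_sort:
    sorteddata = {}
    # Python 3: use items() (iteritems() exists only in Python 2)
    for name, namedata in catdata.items():
        # Pair each value with its 'order' value, sort the pairs, keep the value
        sorteddata[name] = [x for (y, x) in sorted(zip(catdata['order'], namedata))]
    postsort.append((category, sorteddata))
print(postsort)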

Which results in:

[(
    'Fruits',
    {
        'color': ['Red', 'Orange', 'Green', 'Brown'],
        'size': [2, 4, 9, 7],
        'order': [1, 2, 3, 4],
        'name': ['Apple', 'Orange', 'Mango', 'Kiwi']
    }
)]

This could be modified to work with your existing data structure, but I would recommend making the change if that is possible.
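
As a minimal sketch of that modification, assuming each inner dictionary holds exactly one list under an arbitrary key (the `sort_nested` name is only for illustration and is not from the question):

```python
def sort_nested(to_sort, order_key='order'):
    """Sort every inner list by the values of the `order_key` list.

    Assumes each sub-dictionary has exactly one entry, e.g. {1: [...]}.
    """
    result = []
    for category, catdata in to_sort:
        # The single list stored under the 'order' entry.
        (order,) = catdata[order_key].values()
        sorted_data = {}
        for name, inner in catdata.items():
            # Unpack the single (inner key, list) pair of this sub-dictionary.
            ((inner_key, values),) = inner.items()
            # Sort on the order value only, so ties never compare the payload.
            paired = sorted(zip(order, values), key=lambda pair: pair[0])
            sorted_data[name] = {inner_key: [value for _, value in paired]}
        result.append((category, sorted_data))
    return result


to_sort = [('Fruits', {
    'size': {1: [4, 2, 7, 9]},
    'name': {1: ['Orange', 'Apple', 'Kiwi', 'Mango']},
    'color': {1: ['Orange', 'Red', 'Brown', 'Green']},
    'order': {1: [2, 1, 4, 3]},
})]

print(sort_nested(to_sort))
```

The unpacking raises a `ValueError` if a sub-dictionary has more than one entry, which is probably what you want if that assumption is ever violated.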

FamousJameous

How to deal with what you've got

Your existing data structure is a bit crazy, but here is how I would handle it (edit: suppose the key for the color list was 123):

>>> to_sort = [
... ('Fruits', 
...     {
...     'size': {1:[4, 2, 7,9]}, 
...     'name': {1:['Orange', 'Apple', 'Kiwi', 'Mango']},
...     'color': {123:['Orange', 'Red', 'Brown','Green']},
...     'order': {1:[2, 1, 4,3]}
...     }
... )
... ]
>>> d = to_sort[0][1]
>>> keys = list(d.keys())
>>> idx = keys.index('order')
>>> ordered_kv = zip(keys, zip(*sorted(zip(*(d[k][n] for k in keys for n in d[k])), key = lambda t:t[idx])))
>>> sorted_dict = {k:{n:list(v) for n in d[k]} for k,v in ordered_kv}
>>> from pprint import pprint
>>> pprint(sorted_dict)
{'color': {123: ['Red', 'Orange', 'Green', 'Brown']},
 'name': {1: ['Apple', 'Orange', 'Mango', 'Kiwi']},
 'order': {1: [1, 2, 3, 4]},
 'size': {1: [2, 4, 9, 7]}}

Let's break this down. First, I make a canonical list of keys and find the index of 'order':

>>> keys = list(to_sort[0][1].keys())
>>> idx = keys.index('order')

The next step is to zip the internal lists together into tuples where each of the items share the same relative position:

>>> list(zip(*(d[k][n] for k in keys for n in d[k])))
[(4, 2, 'Orange', 'Orange'), (2, 1, 'Red', 'Apple'), (7, 4, 'Brown', 'Kiwi'), (9, 3, 'Green', 'Mango')]

This can now be sorted according to the idx position and then "unzipped" (which really just means applying the zip-splat combination again):

>>> list(zip(*sorted(zip(*(d[k][n] for k in keys for n in d[k])), key=lambda t:t[idx])))
[(2, 4, 9, 7), (1, 2, 3, 4), ('Red', 'Orange', 'Green', 'Brown'), ('Apple', 'Orange', 'Mango', 'Kiwi')]
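
To see the zip-splat idea in isolation, here is a small sketch on two made-up lists (the names `sizes` and `orders` are only for illustration):

```python
sizes = [4, 2, 7, 9]
orders = [2, 1, 4, 3]

# "Zip" the columns into rows: one tuple per position.
rows = list(zip(sizes, orders))

# Sort the rows by the second field (the order value).
rows_sorted = sorted(rows, key=lambda row: row[1])

# "Unzip" by applying zip to the unpacked rows, which gives the columns back.
sorted_sizes, sorted_orders = (list(col) for col in zip(*rows_sorted))

print(sorted_sizes)
print(sorted_orders)
```

Zipping turns columns into rows, and zipping the unpacked rows turns them back into columns, so the same function does both jobs.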

And finally, you rebuild your crazy dictionary with a dictionary comprehension, making sure to zip up your ordered values with the original keys:

>>> ordered_kv = zip(keys, zip(*sorted(zip(*(d[k][n] for k in keys for n in d[k])), key = lambda t:t[idx])))
>>> sorted_dict = {k:{n:list(v) for n in d[k]} for k,v in ordered_kv}
>>> from pprint import pprint
>>> pprint(sorted_dict)
{'color': {123: ['Red', 'Orange', 'Green', 'Brown']},
 'name': {1: ['Apple', 'Orange', 'Mango', 'Kiwi']},
 'order': {1: [1, 2, 3, 4]},
 'size': {1: [2, 4, 9, 7]}}
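
If the nested one-liner is hard to follow, an alternative (not the approach above, but a plain index-permutation sort) is to compute the sorted positions once from the 'order' list and apply them to every list. A minimal sketch, assuming the 'order' sub-dictionary holds a single list, with `reorder` as a hypothetical helper name:

```python
def reorder(d, order_key='order'):
    """Return a copy of d with every inner list rearranged by d[order_key]."""
    # The single list under the 'order' entry, whatever its inner key is.
    order = next(iter(d[order_key].values()))
    # Positions of the original items, in ascending order of their 'order' value.
    positions = sorted(range(len(order)), key=order.__getitem__)
    return {
        k: {n: [values[i] for i in positions] for n, values in inner.items()}
        for k, inner in d.items()
    }


d = {
    'size': {1: [4, 2, 7, 9]},
    'name': {1: ['Orange', 'Apple', 'Kiwi', 'Mango']},
    'color': {123: ['Orange', 'Red', 'Brown', 'Green']},
    'order': {1: [2, 1, 4, 3]},
}

sorted_dict = reorder(d)
print(sorted_dict)
```

Because the inner keys are never hard-coded, it does not matter whether a list sits under 1, 12 or 123.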

However...

You should really consider using the pandas library for manipulating data like this. Observe:

>>> import pandas as pd
>>> # take the single inner list whatever its key is (1, 123, ...)
>>> df = pd.DataFrame({k: pd.Series(next(iter(v.values()))) for k,v in to_sort[0][1].items()})
>>> df
    color    name  order  size
0  Orange  Orange      2     4
1     Red   Apple      1     2
2   Brown    Kiwi      4     7
3   Green   Mango      3     9

Notice I still had to finagle your original data structure into a pandas DataFrame, but if you start whatever you are doing with a DataFrame to begin with, it will all be much easier. Now you can do cool things like:

>>> df.sort_values('order')
    color    name  order  size
1     Red   Apple      1     2
0  Orange  Orange      2     4
3   Green   Mango      3     9
2   Brown    Kiwi      4     7
juanpa.arrivillaga
  • There is a catch. "{'color': {1: "can be "{'color': {12:" or "{'color': {123:" in that case sorted_values = zip(*sorted(zip(*(to_sort[0][1][k][1] will not work. We have to substitute the whole value sorted_values = zip(*sorted(zip(*(to_sort[0][1][k]["value"]. any idea how to do it. – Sandy Sep 01 '16 at 17:27