
I have a list of dicts similar to this:

allsites = [
    {
        'A5': 'G', 
        'A10': 'G', 
        'site': 'example1.com', 
        'A1': 'G'
    }, 
    {
        'A5': 'R', 
        'A10': 'Y',
        'site': 'example2.com', 
        'A1': 'G'
    }
]

I pass it to json.dumps:

import json

data = {'Author': "joe", 'data': allsites}
print(json.dumps(data, sort_keys=True, indent=4, separators=(',', ': ')))

This outputs the following JSON:

{
    "Author": "joe",
    "data": [
        {
            "A1": "G",
            "A10": "G",
            "A5": "G",
            "site": "example1.com"
        },
        {
            "A1": "G",
    (...)

I would like the keys of the dicts in the "data" section of this JSON string to be sorted in a custom order (a custom "alphabet"). In the case above, that order would be site, A1, A5, A10, so the output would look like:

{
    "Author": "joe",
    "data": [
        {
            "site": "example1.com",
            "A1": "G",
            "A5": "G",
            "A10": "G"
        },
        {
            "site": "example2.com",
            "A1": "G",
    (...)

I read about custom sorting in the Sorting FAQ, but it only shows how to override the comparison function, and I do not know how to apply that to my code.

How can I do that?

Remi Guan
WoJ

3 Answers


Since Python dicts do not guarantee any key order (in Python 2, and in Python 3 before 3.7), use collections.OrderedDict with a custom sort:

from collections import OrderedDict
import json

allsites = [
    {
        'A5': 'G',
        'A10': 'G',
        'site': 'example1.com',
        'A1': 'G'
    },
    {
        'A5': 'R',
        'A10': 'Y',
        'site': 'example2.com',
        'A1': 'G'
    }
]

sort_order = ['site', 'A1', 'A5', 'A10']
allsites_ordered = [OrderedDict(sorted(item.iteritems(), key=lambda (k, v): sort_order.index(k)))
                    for item in allsites]

data = {'Author': "joe", 'data': allsites_ordered}
print json.dumps(data, indent=4, separators=(',', ': '))

prints:

{
    "data": [
        {
            "site": "example1.com",
            "A1": "G",
            "A5": "G",
            "A10": "G"
        },
        {
            "site": "example2.com",
            "A1": "G",
            "A5": "R",
            "A10": "Y"
        }
    ],
    "Author": "joe"
}
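
One caveat with the approach above: sort_order.index(k) raises ValueError for any key that is not listed in sort_order. The following is a minimal sketch of one way around that, written to run on Python 3; the helper name order_keys, the rank lookup table, and the sample record are illustrative assumptions, not part of the original answer. Unlisted keys are placed after the listed ones, in alphabetical order.

```python
import json
from collections import OrderedDict

sort_order = ['site', 'A1', 'A5', 'A10']
# Precompute each key's position once instead of calling list.index per key.
rank = {key: position for position, key in enumerate(sort_order)}

def order_keys(record):
    """Return an OrderedDict with listed keys first, unlisted keys after them."""
    def sort_key(pair):
        key = pair[0]
        # Unlisted keys all share rank len(sort_order); the key name breaks ties.
        return (rank.get(key, len(sort_order)), key)
    return OrderedDict(sorted(record.items(), key=sort_key))

# Hypothetical record with two keys ('notes', 'extra') missing from sort_order.
record = {'A5': 'G', 'notes': 'x', 'site': 'example1.com', 'A1': 'G', 'extra': 1}
ordered = order_keys(record)
print(json.dumps(ordered))
```

Whether unlisted keys should go last, go first, or be rejected is a design choice; this sketch simply picks "last".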
alecxe
  • This no longer works in Python 3, @alecxe; it says: "tuple parameter unpacking is not supported in python 3". See Scott's answer. – Rodrigo Sep 22 '16 at 01:21

In Python 3, alecxe's answer no longer works. This should be a comment, but I lack the reputation.

PEP 3113 removed tuple unpacking in function signatures, so the line

allsites_ordered = [OrderedDict(sorted(item.iteritems(), key=lambda (k, v): sort_order.index(k)))
                    for item in allsites]

now has to be

allsites_ordered = [OrderedDict(sorted(item.items(), key=lambda kv: sort_order.index(kv[0])))
                    for item in allsites]

or similar. iteritems has also become just items.
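
For reference, here is a sketch of the whole script with both changes applied, so it can be run as-is on Python 3. The compact literal for allsites and the use of an OrderedDict for the outer data mapping (to put "Author" first) are my own choices for illustration, not part of alecxe's answer.

```python
import json
from collections import OrderedDict

allsites = [
    {'A5': 'G', 'A10': 'G', 'site': 'example1.com', 'A1': 'G'},
    {'A5': 'R', 'A10': 'Y', 'site': 'example2.com', 'A1': 'G'},
]

sort_order = ['site', 'A1', 'A5', 'A10']

# Python 3: items() instead of iteritems(), and a single-argument lambda
# that indexes into the (key, value) pair instead of unpacking it.
allsites_ordered = [
    OrderedDict(sorted(site.items(), key=lambda kv: sort_order.index(kv[0])))
    for site in allsites
]

# The outer mapping is ordered explicitly as well (an assumption about the
# desired layout: "Author" before "data").
data = OrderedDict([('Author', 'joe'), ('data', allsites_ordered)])
output = json.dumps(data, indent=4)
print(output)
```

Note that sort_keys must stay at its default of False here; passing sort_keys=True would make json.dumps re-sort the keys alphabetically.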

Scott Colby
  • And in Python 3.6, the dict is now ordered by default. That is, it works and somehow we shouldn't "plan on it working". – Charles Merriam Aug 29 '17 at 00:07
  • @CharlesMerriam Agreed. There's a bit of absurdity surrounding dictionary ordering at the moment. The biggest issue, I suppose, is that non-CPython implementations of the language might not have "gotten things in order yet" so to speak. – Scott Colby Aug 30 '17 at 00:31
  • Small update: as of Python 3.7, the `dict` is ordered by default and we *can* count on it, since it has become part of the spec. There is still an argument to be made for using an `OrderedDict` as a semantic indicator that you're relying on the ordering property. Additionally, the `OrderedDict` keeps its extra `move_to_end()` method and the `last` argument to `popitem()`, which the builtin `dict` still lacks. More details in this answer: https://stackoverflow.com/a/50872567/600882 – Scott Colby Nov 10 '18 at 00:19
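
To illustrate the last comment, here is a minimal sketch that assumes Python 3.7 or later: a plain dict built from sorted (key, value) pairs is serialized in that same order. The variable names are illustrative only.

```python
import json

sort_order = ['site', 'A1', 'A5', 'A10']
site = {'A5': 'G', 'A10': 'G', 'site': 'example1.com', 'A1': 'G'}

# On Python 3.7+ a plain dict keeps insertion order, so building it from
# sorted (key, value) pairs is enough; json.dumps emits keys in that order
# as long as sort_keys is left at its default of False.
ordered = dict(sorted(site.items(), key=lambda kv: sort_order.index(kv[0])))
text = json.dumps(ordered)
print(text)  # {"site": "example1.com", "A1": "G", "A5": "G", "A10": "G"}
```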

I had exactly the same problem and devised a lightweight general solution:

import json
from collections import OrderedDict

def make_custom_sort(orders):
    # Map each key to a negative rank so that keys missing from an order
    # (rank 0) sort after all the listed ones.
    orders = [{k: -i for (i, k) in enumerate(reversed(order), 1)} for order in orders]
    def process(stuff):
        if isinstance(stuff, dict):
            items = [(k, process(v)) for (k, v) in stuff.items()]
            keys = set(stuff)
            # Use the first order whose keys all appear in this dict.
            for order in orders:
                if keys.issuperset(order):
                    return OrderedDict(sorted(items, key=lambda x: order.get(x[0], 0)))
            # No order applies: fall back to sorting by key.
            return OrderedDict(sorted(items))
        if isinstance(stuff, list):
            return [process(x) for x in stuff]
        return stuff
    return process

First, you create an instance of a custom-order sorting function:

custom_sort = make_custom_sort([ ["site", "A1", "A5", "A10"] ])

Now, the actual sorting:

result = custom_sort(allsites)

... which you may dump as a JSON object:

print(json.dumps(result, indent=4))

Result

[
    {
        "site": "example1.com", 
        "A1": "G", 
        "A5": "G", 
        "A10": "G"
    }, 
    {
        "site": "example2.com", 
        "A1": "G", 
        "A5": "R", 
        "A10": "Y"
    }
]

More

The closure is recursive, so nested dictionaries and lists are processed as well. As the double brackets indicate, make_custom_sort takes a list of sort orders, so you can pass as many as the various dictionaries nested in your structure require.
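
As a sketch of what passing several orders looks like, the example below repeats the function so that it runs on its own, then applies two orders to a nested structure. The report data and the second order ["Author", "data"] are hypothetical, chosen to mirror the question.

```python
import json
from collections import OrderedDict

def make_custom_sort(orders):
    # Same function as above, repeated so this example is self-contained.
    orders = [{k: -i for (i, k) in enumerate(reversed(order), 1)} for order in orders]
    def process(stuff):
        if isinstance(stuff, dict):
            items = [(k, process(v)) for (k, v) in stuff.items()]
            keys = set(stuff)
            for order in orders:
                if keys.issuperset(order):
                    return OrderedDict(sorted(items, key=lambda x: order.get(x[0], 0)))
            return OrderedDict(sorted(items))
        if isinstance(stuff, list):
            return [process(x) for x in stuff]
        return stuff
    return process

# Hypothetical nested input: an outer dict wrapping the list of site dicts.
report = {
    "data": [{"A5": "G", "A10": "G", "site": "example1.com", "A1": "G"}],
    "Author": "joe",
}

# One order per kind of dictionary found in the structure.
custom_sort = make_custom_sort([
    ["site", "A1", "A5", "A10"],  # applies to the inner site dicts
    ["Author", "data"],           # applies to the outer dict
])

result = custom_sort(report)
print(json.dumps(result, indent=4))
```

An order only applies to a dict that contains all of its keys, and the first matching order wins; a dict that matches none of them is sorted by key.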

Project on GitHub: https://github.com/laowantong/customsort

Aristide