
I have a list of dictionaries in Python. This list is passed around as JSON between web services, and those services create unique signatures based on the JSON they exchange. Part of creating the signature is normalizing the data payload and making sure that everything is in the correct order, so I'm doing this (in Python), which works fine:

data = [{'a': '1', 'b': '2', 'c': 3}, {'d': 3}, {3: 1}, {'100': '200'}]
sorted(data)
> [{3: 1}, {'100': '200'}, {'d': 3}, {'a': '1', 'c': 3, 'b': '2'}]

Now, I need to add a C# application into the mix which needs to be able to create the exact same signature as the Python code does. I have not discovered the secret sauce to sort the above data structure in the same way as Python's sorted builtin function.

I'm using ServiceStack to parse the JSON data.

I was hoping that it would be as easy as doing something like this (in C#):

var jsonPayload = "[{\"a\": \"1\", \"b\": \"2\", \"c\": 3}, {\"d\": 3}, {3: 1}, {\"100\": \"200\"}]";
var parsedJson = JsonArrayObjects.Parse(jsonPayload);
parsedJson.Sort();

However, I get this exception from the above C# code:

`At least one object must implement IComparable`

I understand why I'm getting this error, but I'm not sure what I should do about it. I really was hoping that I would not have to roll my own sorting logic. The actual data that I'm dealing with is very dynamic. This is just an example of something that is preventing me from moving forward.

Does anyone have any suggestions or recommendations on how I can get a sort in C# to work like the sorted python function for this type of nested data structure?

Thanks!

Matthew J Morrison
  • Maybe you could use a signature based on the content instead of the JSON string? Also, when I try to run that `sorted` command in Python 3.2.1 (with no particular libraries for sorting that), it says `TypeError: unorderable types: dict() < dict()`. How does that comparison happen? – Tim S. Jan 10 '14 at 21:51
  • Sorry, I should have stated that I am using Python 2.7.5. – Matthew J Morrison Jan 10 '14 at 21:53
  • This existing question might help you: http://stackoverflow.com/questions/736443/ironpython-and-c-sharp-script-access-to-c-sharp-objects – Adam Miller Jan 10 '14 at 21:59
  • You need `parsedJson.OrderBy(x => x, someComparator)` where `someComparator` implements the same thing that Python 2.7's dict's `__cmp__` does. I'm not sure how that works just yet, will look into it if I can... – Tim S. Jan 10 '14 at 22:01
  • http://stackoverflow.com/questions/3484293/is-there-a-description-of-how-cmp-works-for-dict-objects-in-python-2 The behaviour is unspecified, though, so it can and probably will vary between implementations. I would strongly recommend against using this as your normalization routine. I would do `sorted(data.items())`, since what it does is obvious and documented – Niklas B. Jan 10 '14 at 22:08
  • @NiklasB. data is a list of dictionaries not a dictionary. – Matthew J Morrison Jan 10 '14 at 22:11
  • @Matthew: I meant `sorted(data, key=lambda d: sorted(d.items()))` – Niklas B. Jan 10 '14 at 22:12
  • @NiklasB. that could work, however our data is arbitrarily nested, not necessarily just one level as demonstrated in the examples. – Matthew J Morrison Jan 10 '14 at 22:13
  • @MatthewJMorrison: Well, then you've got some work to do. It's simple, though, just use recursion – Niklas B. Jan 10 '14 at 22:13
  • @MatthewJMorrison: Could you create your signature based on the raw payload instead of interpreting it to an object first? – Aaron Jan 11 '14 at 13:08
  • Aside from all the comments so far, @MatthewJMorrison, did you realize that `sorted` doesn't "normalize" your list beyond the first layer of nesting? You'd need a custom function one way or the other.. – Niklas B. Jan 13 '14 at 02:44
  • That's a strange sort, why does 'd' come before 'a' ? – Paul McCarthy Jul 28 '23 at 10:36

3 Answers


The sample data isn't valid JSON as shown. The integer three cannot be a key; JSON object keys must be strings. Change {3: 1} to {"3": 1}.
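As a quick check of that point, here is a minimal Python sketch using the standard `json` module. The variable names are illustrative only; the payloads are shortened versions of the one in the question.

```python
import json

# Shortened versions of the payload from the question.
invalid_payload = '[{"d": 3}, {3: 1}]'    # bare integer used as an object key
valid_payload = '[{"d": 3}, {"3": 1}]'    # the same key written as a string

try:
    json.loads(invalid_payload)
    invalid_was_accepted = True
except ValueError:
    # json.JSONDecodeError is a subclass of ValueError.
    invalid_was_accepted = False

parsed = json.loads(valid_payload)
print(invalid_was_accepted, parsed)
```

Note that after the fix the key is the string "3", not the integer 3, and that can change where the dictionary lands in a sort that compares keys.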

The second issue is that C# dictionaries are not orderable by default. However, you can subclass them to make them orderable.

The Python 2.x algorithm for ordering dictionaries is:

1) If the sizes of the dictionaries differ, the one with fewer entries is the smaller.

2) If the sizes are the same, find in each dictionary the smallest key that is either missing from the other dictionary or mapped to a different value there. The dictionary whose mismatched key is smaller is the smaller one; if the two keys compare equal, the values stored under them decide.
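The two steps above can be sketched in Python 3, which no longer orders dictionaries itself. This is a rough illustration, not a faithful port. `py2_cmp` and `_rank` are hypothetical helpers that approximate CPython 2's mixed-type ordering (None first, then numbers, then other types by type name) only for the kinds of values JSON produces. `characterize` and `dict_compare` are named after the C functions they imitate.

```python
from functools import cmp_to_key


def _rank(value):
    # Assumed approximation of CPython 2's fallback ordering:
    # None sorts first, numbers next, everything else by type name.
    if value is None:
        return (0, "")
    if isinstance(value, (bool, int, float)):
        return (1, "")
    return (2, type(value).__name__)


def py2_cmp(x, y):
    """Three-way compare returning -1, 0 or 1 for JSON-like values."""
    rx, ry = _rank(x), _rank(y)
    if rx != ry:
        return -1 if rx < ry else 1
    if x is None:
        return 0
    if isinstance(x, dict):
        return dict_compare(x, y)
    if isinstance(x, list):
        # Lists compare element by element, then by length.
        for xi, yi in zip(x, y):
            c = py2_cmp(xi, yi)
            if c:
                return c
        return (len(x) > len(y)) - (len(x) < len(y))
    return (x > y) - (x < y)


def characterize(a, b):
    """Return (found, key): the smallest key of a that b lacks or maps differently."""
    found, best = False, None
    for key in a:
        if found and py2_cmp(key, best) >= 0:
            continue  # not smaller than the current candidate
        if key not in b or a[key] != b[key]:
            found, best = True, key
    return found, best


def dict_compare(a, b):
    # Step 1: the dictionary with fewer entries is the smaller one.
    if len(a) != len(b):
        return -1 if len(a) < len(b) else 1
    # Step 2: compare the smallest mismatched keys, then their values.
    a_found, a_key = characterize(a, b)
    if not a_found:
        return 0  # same size and nothing differs: equal
    b_found, b_key = characterize(b, a)
    if not b_found:
        return 0
    res = py2_cmp(a_key, b_key)
    if res == 0:
        res = py2_cmp(a[a_key], b[b_key])
    return res


data = [{'a': '1', 'b': '2', 'c': 3}, {'d': 3}, {3: 1}, {'100': '200'}]
result = sorted(data, key=cmp_to_key(py2_cmp))
print(result)
```

Under these assumptions the sketch reproduces the order shown in the question: the three single-entry dictionaries come before the three-entry one because length is compared first, which is why `{'d': 3}` sorts ahead of the dictionary containing `'a'`. A C# comparer would need to make the same choices for mixed-type keys and values.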

Here is the relevant extract from the Python 2.7 source code for Objects/dictobject.c:

/* Subroutine which returns the smallest key in a for which b's value
   is different or absent.  The value is returned too, through the
   pval argument.  Both are NULL if no key in a is found for which b's status
   differs.  The refcounts on (and only on) non-NULL *pval and function return
   values must be decremented by the caller (characterize() increments them
   to ensure that mutating comparison and PyDict_GetItem calls can't delete
   them before the caller is done looking at them). */

static PyObject *
characterize(PyDictObject *a, PyDictObject *b, PyObject **pval)
{
    PyObject *akey = NULL; /* smallest key in a s.t. a[akey] != b[akey] */
    PyObject *aval = NULL; /* a[akey] */
    Py_ssize_t i;
    int cmp;

    for (i = 0; i <= a->ma_mask; i++) {
        PyObject *thiskey, *thisaval, *thisbval;
        if (a->ma_table[i].me_value == NULL)
            continue;
        thiskey = a->ma_table[i].me_key;
        Py_INCREF(thiskey);  /* keep alive across compares */
        if (akey != NULL) {
            cmp = PyObject_RichCompareBool(akey, thiskey, Py_LT);
            if (cmp < 0) {
                Py_DECREF(thiskey);
                goto Fail;
            }
            if (cmp > 0 ||
                i > a->ma_mask ||
                a->ma_table[i].me_value == NULL)
            {
                /* Not the *smallest* a key; or maybe it is
                 * but the compare shrunk the dict so we can't
                 * find its associated value anymore; or
                 * maybe it is but the compare deleted the
                 * a[thiskey] entry.
                 */
                Py_DECREF(thiskey);
                continue;
            }
        }

        /* Compare a[thiskey] to b[thiskey]; cmp <- true iff equal. */
        thisaval = a->ma_table[i].me_value;
        assert(thisaval);
        Py_INCREF(thisaval);   /* keep alive */
        thisbval = PyDict_GetItem((PyObject *)b, thiskey);
        if (thisbval == NULL)
            cmp = 0;
        else {
            /* both dicts have thiskey:  same values? */
            cmp = PyObject_RichCompareBool(
                                    thisaval, thisbval, Py_EQ);
            if (cmp < 0) {
                Py_DECREF(thiskey);
                Py_DECREF(thisaval);
                goto Fail;
            }
        }
        if (cmp == 0) {
            /* New winner. */
            Py_XDECREF(akey);
            Py_XDECREF(aval);
            akey = thiskey;
            aval = thisaval;
        }
        else {
            Py_DECREF(thiskey);
            Py_DECREF(thisaval);
        }
    }
    *pval = aval;
    return akey;

Fail:
    Py_XDECREF(akey);
    Py_XDECREF(aval);
    *pval = NULL;
    return NULL;
}

static int
dict_compare(PyDictObject *a, PyDictObject *b)
{
    PyObject *adiff, *bdiff, *aval, *bval;
    int res;

    /* Compare lengths first */
    if (a->ma_used < b->ma_used)
        return -1;              /* a is shorter */
    else if (a->ma_used > b->ma_used)
        return 1;               /* b is shorter */

    /* Same length -- check all keys */
    bdiff = bval = NULL;
    adiff = characterize(a, b, &aval);
    if (adiff == NULL) {
        assert(!aval);
        /* Either an error, or a is a subset with the same length so
         * must be equal.
         */
        res = PyErr_Occurred() ? -1 : 0;
        goto Finished;
    }
    bdiff = characterize(b, a, &bval);
    if (bdiff == NULL && PyErr_Occurred()) {
        assert(!bval);
        res = -1;
        goto Finished;
    }
    res = 0;
    if (bdiff) {
        /* bdiff == NULL "should be" impossible now, but perhaps
         * the last comparison done by the characterize() on a had
         * the side effect of making the dicts equal!
         */
        res = PyObject_Compare(adiff, bdiff);
    }
    if (res == 0 && bval != NULL)
        res = PyObject_Compare(aval, bval);

Finished:
    Py_XDECREF(adiff);
    Py_XDECREF(bdiff);
    Py_XDECREF(aval);
    Py_XDECREF(bval);
    return res;
}
Raymond Hettinger

You can work around it: Call the Python sorting function from C#, so you will have exactly the same behaviour.

You can use IronPython:

Python code:

def Simple():
    print "Hello from Python"
    print "Call Dir(): "
    print dir()

C# code:

using System;
using IronPython.Hosting;
using Microsoft.Scripting.Hosting;

public class dynamic_demo
{
    static void Main()
    {
        var ipy = Python.CreateRuntime();
        dynamic test = ipy.UseFile("Test.py");
        test.Simple();
    }
}

A complete example is available here:

http://blogs.msdn.com/b/charlie/archive/2009/10/25/hosting-ironpython-in-a-c-4-0-program.aspx

phyrox

Another option is to call the Python sorting function from the command-line within your C# app. You can use this method:

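using System;
using System.Diagnostics;
using System.IO;

// Runs a Python script and echoes whatever it writes to standard output.
private void run_cmd(string cmd, string args)
{
     ProcessStartInfo start = new ProcessStartInfo();
     start.FileName = "my/full/path/to/python.exe";
     // cmd is the script to run; args are passed through to it.
     start.Arguments = string.Format("{0} {1}", cmd, args);
     start.UseShellExecute = false;        // required in order to redirect output
     start.RedirectStandardOutput = true;
     using(Process process = Process.Start(start))
     {
         using(StreamReader reader = process.StandardOutput)
         {
             string result = reader.ReadToEnd();
             Console.Write(result);
         }
     }
}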

More information about this is available in the question "How do I run a Python script from C#?"

Community
phyrox