Elegantly Generalising Sorting into Dictionaries in Python?

Question

The list comprehension is a great structure for generalising working with lists in such a way that the creation of lists can be managed elegantly. Is there a similar tool for managing Dictionaries in Python?

I have the following functions:

# takes in 3 lists of lists and a column specification by which to group
def custom_groupby(atts, zmat, zmat2, col):
    result = dict()
    for i in range(0, len(atts)):
        val = atts[i][col]
        row = (atts[i], zmat[i], zmat2[i])
        try:
            result[val].append(row)
        except KeyError:
            result[val] = list()
            result[val].append(row)
    return result

# organises samples into dictionaries using the groupby
def organise_samples(attributes, z_matrix, original_z_matrix):
    strucdict = custom_groupby(attributes, z_matrix, original_z_matrix, 'SecStruc')

    strucfrontdict = dict()
    # .items() works on both Python 2 and 3; v is the group already looked up for k
    for k, v in strucdict.items():
        strucfrontdict[k] = custom_groupby([x[0] for x in v],
                                [x[1] for x in v], [x[2] for x in v], 'Front')

    samples = dict()
    for k in strucfrontdict:
        samples[k] = dict()
        for k2 in strucfrontdict[k]:
            samples[k][k2] = dict()
            samples[k][k2] = custom_groupby([x[0] for x in strucfrontdict[k][k2]],
                    [x[1] for x in strucfrontdict[k][k2]], [x[2] for x in strucfrontdict[k][k2]], 'Back')
    return samples

It seems like this is unwieldy. There being elegant ways to do almost everything in Python, I'm inclined to think I'm using Python wrongly.

More importantly, I'd like to be able to generalise this function better so that I can specify how many "layers" should be in the dictionary (without using several lambdas and approaching the problem in a Lisp style). I would like a function:

# organises samples into a dictionary by specified columns
# number of layers could also be inferred from the number of criteria
def organise_samples(number_layers, list_of_strings_for_column_ids):
    pass

Is this possible to do in Python?

Thank you! Even if there isn't a way to do it elegantly in Python, any suggestions towards making the above code more elegant would be really appreciated.

::EDIT::

For context, the attributes object, z_matrix, and original_z_matrix are all lists of NumPy arrays.

Attributes might look like this:

Type,Num,Phi,Psi,SecStruc,Front,Back
11,181,-123.815,65.4652,2,3,19
11,203,148.581,-89.9584,1,4,1
11,181,-123.815,65.4652,2,3,19
11,203,148.581,-89.9584,1,4,1
11,137,-20.2349,-129.396,2,0,1
11,163,-34.75,-59.1221,0,1,9

The Z-matrices might both look like this:

CA-1, CA-2, CA-CB-1, CA-CB-2, N-CA-CB-SG-1, N-CA-CB-SG-2
-16.801, 28.993, -1.189, -0.515, 118.093, 74.4629
-24.918, 27.398, -0.706, 0.989, 112.854, -175.458
-1.01, 37.855, 0.462, 1.442, 108.323, -72.2786
61.369, 113.576, 0.355, -1.127, 111.217, -69.8672

Samples is a dict{num => dict{num => dict{num => tuple(attributes, z_matrix)}}}, where each tuple holds one row of the z-matrix.

I have a list of samples with attributes and values, and I'd like to organise the samples into arbitrary dictionaries by their attribute values. The way I'm doing it now works very well and is surprisingly quick, but it isn't very general, meaning I have to hardcode a new function every time I want to organise the samples differently. Right now it organises them as dict[secstruc][front][back]. I'd like to be able to make a function that takes in those three columns as parameters and returns this dictionary using the custom groupby function. — calben, Nov 26 '13 at 04:19
What does your `attributes` value contain? It seems to be some kind of two level structure (perhaps a `list` of `dict`s?), but it is very unclear to me what the levels mean. — Blckknght, Nov 26 '13 at 04:33
attributes, z_matrix, and original_z_matrix are all lists of numpy arrays representing samples for analysis, where attributes are the features and z_matrix is a set of numeric values (it's a z-matrix from biochemistry and chemistry). Editing sample input and output into the question! — calben, Nov 26 '13 at 13:23

score 1 · Answer 1 · edited May 23 '17 at 11:56

The list comprehension is a great structure for generalising working with lists in such a way that the creation of lists can be managed elegantly. Is there a similar tool for managing Dictionaries in Python?

Have you tried using dictionary comprehensions?

See this great question about dictionary comprehensions.
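As a rough illustration of what a dictionary comprehension looks like for one level of grouping, here is a minimal sketch. The `rows` list and its plain-dict layout are assumptions made for the example (the question uses NumPy arrays); only the column names come from the question.

```python
# Hypothetical sample rows; the column names mirror the question's header.
rows = [
    {"SecStruc": 2, "Front": 3, "Back": 19},
    {"SecStruc": 1, "Front": 4, "Back": 1},
    {"SecStruc": 2, "Front": 0, "Back": 1},
]

# Collect the distinct grouping values with a set comprehension.
keys = {row["SecStruc"] for row in rows}

# One level of grouping as a dictionary comprehension:
# each distinct SecStruc value maps to the list of rows that carry it.
by_struc = {k: [row for row in rows if row["SecStruc"] == k] for k in keys}
```

Note that this rescans `rows` once per distinct key, so a single loop that appends into a dictionary of lists is likely cheaper on large inputs. The comprehension is mainly attractive for its brevity.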

answered Nov 26 '13 at 08:26

oz123

This is great, thanks! I'll work towards applying it and edit the question appropriately (or mark this as accepted answer) – calben Nov 26 '13 at 13:23
While Dictionary Comprehensions have proven beautiful, I haven't figured out a way to use them to make dictionaries n levels deep. Do you know a way to manipulate the dictionary comprehensions to make the comprehensions run arbitrarily many times depending on the input columns? – calben Nov 26 '13 at 19:44
@Kylamus, your question has a lot of information, but I still don't get what you want to do. Can you post what you want to do? – oz123 Nov 26 '13 at 20:24
Essentially, and in more general terms, I would like to be able to make a function that creates n layers of dictionaries. So if I gave the function 4 arguments it would make dict{k-> dict{k-> dict{k-> dict}}}. If I passed it 2 it would be dict{k-> dict}}. Does that make sense? I'm using dictionaries to organise columns of data and would like to be able to do this arbitrarily. This is not implemented in NumPy... yet. – calben Nov 27 '13 at 00:28
@Kylamus, what's `dict{k-> dict}}` and why is it so important to use `nested dictionaries` ? please re-edit your question and put inputs, call to the function and expected output. – oz123 Nov 27 '13 at 07:01
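One possible reading of the "n layers" request in the comments above is a recursive function that adds one dictionary layer per column name. The sketch below is an assumption about the intended behaviour, not something confirmed in the thread: `group_by` is a hypothetical helper, the signature of `organise_samples` differs from the one proposed in the question, and the attributes are plain dicts with placeholder z-matrix rows rather than NumPy arrays.

```python
from collections import defaultdict


def group_by(rows, col):
    """Group rows by the value stored under `col` in each row's attributes."""
    groups = defaultdict(list)
    for row in rows:
        # row[0] is the attributes record of the sample
        groups[row[0][col]].append(row)
    return dict(groups)


def organise_samples(rows, cols):
    """Nest rows into one dictionary layer per column name in `cols`."""
    if not cols:
        # No columns left: the innermost value is the list of rows.
        return rows
    first, rest = cols[0], cols[1:]
    return {key: organise_samples(group, rest)
            for key, group in group_by(rows, first).items()}


# Hypothetical data: each row is (attributes, z_row, original_z_row).
attributes = [
    {"SecStruc": 2, "Front": 3, "Back": 19},
    {"SecStruc": 1, "Front": 4, "Back": 1},
    {"SecStruc": 2, "Front": 3, "Back": 19},
    {"SecStruc": 2, "Front": 0, "Back": 1},
]
z_matrix = ["z0", "z1", "z2", "z3"]
original_z_matrix = ["o0", "o1", "o2", "o3"]

rows = list(zip(attributes, z_matrix, original_z_matrix))
samples = organise_samples(rows, ["SecStruc", "Front", "Back"])
```

Under these assumptions the number of layers follows from the length of the column list, so `organise_samples(rows, ["SecStruc", "Front"])` would give two layers and `samples[2][3][19]` holds the rows matching all three values.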