Sorting tuples based on another list

Question

I am working with clustered data generated by SciPy, and would like to sort it in a custom order.

Let's say that my data comes out looking like this:

leafIDs = [4,5,3,1,2]
rowHeaders = ['lorem','ipsum','dolor','sit','amet']

There is a one-to-one correspondence between the two lists, leafIDs and rowHeaders, and both will always be the same length. For example, the row with the header lorem has a leaf ID of 4, ipsum has an ID of 5, and so on. Note that the leafIDs are not the order I want to sort by (otherwise I could use the tried and tested method). The correspondence can be visualised as follows:

+---------+------------+
| leafIDs | rowHeaders |
+---------+------------+
|       4 | lorem      |
|       5 | ipsum      |
|       3 | dolor      |
|       1 | sit        |
|       2 | amet       |
+---------+------------+

Now I would like to sort these two lists by a custom order, which will again always be the same length as both aforementioned lists. You can see it as a scrambled version of rowHeaders:

rowHeaders_custom = ['amet','lorem','sit','ipsum','dolor']

The desired outcome is leafIDs sorted according to rowHeaders_custom, using its one-to-one relationship with rowHeaders, i.e.:

# Desired outcome
leafIDs_custom = [2,4,1,5,3]

What I've tried so far: my current approach is as follows:

Zip leafIDs and rowHeaders, i.e. zippedRows = zip(leafIDs, rowHeaders).
Attempt to sort the list of tuples by the list rowHeaders_custom.

However, I am hitting a roadblock on the second step. It would be nice if there were any suggestions on how to perform this custom-ordered sort. I understand I might be hitting an XY problem by attempting to order a list of tuples with another list, but my understanding of sort() is rather limited.

@PadraicCunningham - That question was mentioned by the OP as being insufficient, and I'm pretty sure it is. — TigerhawkT3, Dec 23 '15 at 00:56
It doesn't account for the extra required lookup. And, again, the OP mentioned it and said it didn't fully address his question, which is why he asked a new question, as the dup boilerplate instructs. — TigerhawkT3, Dec 23 '15 at 00:58

score 4 · Accepted Answer · answered Dec 22 '15 at 23:41

What if you make a dict out of the zipped rows, with the headers as keys? I.e.

>>> dict(zip(rowHeaders, leafIDs))
{'ipsum': 5, 'sit': 1, 'lorem': 4, 'amet': 2, 'dolor': 3}

Capturing that, then:

dictRows = dict(zip(rowHeaders, leafIDs))

You could just pull the values out of that:

leafIDs_custom = [dictRows[v] for v in rowHeaders_custom]

I don't know, there might be a more pythonic way to do it, but that's the solution I'm coming up with.
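For comparison, here is a minimal sketch of the approach the question originally attempted: sorting the zipped tuples themselves. It assumes every header in rowHeaders appears exactly once in rowHeaders_custom; the name `rank` is introduced here for illustration only and is not part of the original question or answer.

```python
leafIDs = [4, 5, 3, 1, 2]
rowHeaders = ['lorem', 'ipsum', 'dolor', 'sit', 'amet']
rowHeaders_custom = ['amet', 'lorem', 'sit', 'ipsum', 'dolor']

# Hypothetical helper: map each header to its position in the custom order.
rank = {header: i for i, header in enumerate(rowHeaders_custom)}

# Sort the (leafID, header) pairs by the rank of their header.
zippedRows = sorted(zip(leafIDs, rowHeaders), key=lambda pair: rank[pair[1]])

# Pull the leaf IDs back out of the sorted pairs.
leafIDs_custom = [leaf for leaf, _ in zippedRows]

print(zippedRows)      # [(2, 'amet'), (4, 'lorem'), (1, 'sit'), (5, 'ipsum'), (3, 'dolor')]
print(leafIDs_custom)  # [2, 4, 1, 5, 3]
```

If only leafIDs_custom is needed, the dict lookup above is simpler; the sorted() variant is mainly useful when the tuples should stay paired after reordering.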

Linus Thiel

Thank you, it worked perfectly! Never thought of using `dict`, actually—that's a rather ingenious solution. Also, `leafIDs_custom = [dictRows[v] for v in rowHeaders_custom]` is pretty pythonic already ;) – Terry Dec 22 '15 at 23:52

Pynchia · Answer 2 · 2015-12-23T00:19:10.170

I presume you have several rows to rearrange, not just one.

Here is a solution that performs the translation of the columns only once, without building a mapping for every row (tuple) to be sorted. After all, the destinations remain the same.

It records the original position of each header and then builds the rearranged tuples by picking the values from those positions.

leaf_lst = [(4,5,3,1,2), (1,2,3,4,5), (6,7,8,9,0)]
rowHeaders = ['lorem','ipsum','dolor','sit','amet']
rowHeaders_custom = ['amet','lorem','sit','ipsum','dolor']

old_pos = tuple(rowHeaders.index(h) for h in rowHeaders_custom)
leaf_lst_custom = [tuple(t[p] for p in old_pos) for t in leaf_lst]
print(leaf_lst_custom)

produces

[(2, 4, 1, 5, 3), (5, 1, 4, 2, 3), (0, 6, 9, 7, 8)]
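A possible variation, sketched here under the assumption that the headers are unique: each call to rowHeaders.index(h) scans the list from the start, so for long header lists the positions could instead be looked up in a dict built once. The names `header_pos` and `pick` are illustrative and not from the answer above; operator.itemgetter is used to do the picking.

```python
from operator import itemgetter

leaf_lst = [(4, 5, 3, 1, 2), (1, 2, 3, 4, 5), (6, 7, 8, 9, 0)]
rowHeaders = ['lorem', 'ipsum', 'dolor', 'sit', 'amet']
rowHeaders_custom = ['amet', 'lorem', 'sit', 'ipsum', 'dolor']

# Build the header -> original position mapping in a single pass.
header_pos = {h: i for i, h in enumerate(rowHeaders)}
old_pos = [header_pos[h] for h in rowHeaders_custom]

# itemgetter with several indices returns a tuple of the picked items.
# Caveat: with a single index it returns a bare item, not a 1-tuple.
pick = itemgetter(*old_pos)
leaf_lst_custom = [pick(t) for t in leaf_lst]

print(leaf_lst_custom)
# [(2, 4, 1, 5, 3), (5, 1, 4, 2, 3), (0, 6, 9, 7, 8)]
```

For five columns the difference is negligible; the dict only starts to matter when the number of headers is large.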

Thanks for the answer! Your code is extremely useful if I have arrays of tuples, but my situation isn't as complicated as you figured, so it doesn't really require any further extensibility :) — Terry, Dec 23 '15 at 01:41
