5

I have a dictionary whose keys are character positions (or position ranges) in a string and whose values are the substrings found at those positions. It looks like this:

{
'24-35': 'another word',
'10-16': 'content',
'17': '[',
'22': ']',
'23': '{',
'37': ')',
'0-8': 'beginning',
'36': '}',
'18-21': 'word',
'9': '(',
}

I'm trying to sort this dictionary by its keys, so that it looks like this:

{
'0-8': 'beginning',
'9': '(',
'10-16': 'content',
'17': '[',
'18-21': 'word',
'22': ']',
'23': '{',
'24-35': 'another word',
'36': '}',
'37': ')'
}

The dictionary is built by iterating over this string:
'beginning(content[word]{another word})', and splitting at the brackets.

I'm trying to sort the dictionary using @Brian's answer to this question. However, it sorts the keys alphabetically, because they were converted to strings when the ranges were built (the step that produces keys like '0-8').

My question is:

How can I transform:

class SortedDisplayDict(dict):
   def __str__(self):
       return "{" + ", ".join("%r: %r" % (key, self[key]) for key in sorted(self)) + "}"

More specifically: how can I sort by int(key.split('-')[0]) while still keeping the ranges in the output?

Quill
  • Just curious: Why don't you build the string while (or even instead of) building the dictionary? – Stefan Pochmann May 24 '15 at 03:58
  • @StefanPochmann, I do build the string while building the dict: each character that isn't a bracket gets added to a variable, and once a bracket comes in, or the string ends, the accumulated string gets pushed to the dict. – Quill May 24 '15 at 04:04
  • @Quill I would say, you don't have to store the range at all, just the starting index. You can always find the ending index by adding the length of the string to the starting index. – thefourtheye May 24 '15 at 04:06
  • @Quill I don't mean the string parts that end up as your dictionary keys. I mean the large string that you're trying to build here. The one representing the whole dict. Why parse the parts in order, store them in a dict where they get unordered, and then take them out of the dict so you have to reorder them? Why not take the parts in original order and build the large final string directly? – Stefan Pochmann May 24 '15 at 04:11
  • Feel free to take a look at [my code on Code Review](http://codereview.stackexchange.com/questions/91613/build-a-dictionary-based-on-split-string). – Quill May 24 '15 at 04:25
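
The suggestion in the comments to store only the starting index could look roughly like the sketch below. It is not the code from the linked Code Review post: the helper name `split_with_starts` and the regular expression are assumptions made for illustration. With integer keys, a plain `sorted()` already orders the pieces numerically, and the range label can be rebuilt on demand.

```python
import re


def split_with_starts(text):
    """Map each piece's start index to the piece, splitting at brackets.

    Hypothetical helper: each regex match is either a single bracket
    character or a run of non-bracket characters.
    """
    pattern = r"[()\[\]{}]|[^()\[\]{}]+"
    return {m.start(): m.group() for m in re.finditer(pattern, text)}


parts = split_with_starts("beginning(content[word]{another word})")

# Integer keys sort numerically without any key function.
for start in sorted(parts):
    piece = parts[start]
    end = start + len(piece) - 1  # inclusive end index
    label = str(start) if end == start else "%d-%d" % (start, end)
    print(label, piece)
```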

3 Answers

7

Use a key function for sorted that converts the first number of each key to an int. It would look like:

sorted(d.items(), key=lambda v: int(v[0].split("-")[0]))

The key function is used only to decide the order; the items that sorted returns still carry their original range notation.
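
Applied to the class from the question, this might look like the following sketch. The helper name `range_start` is an assumption introduced here for readability; only the `key=` argument to `sorted` differs from the original class.

```python
def range_start(key):
    """Return the integer before the first '-' (or the whole key if there is none)."""
    return int(key.split("-")[0])


class SortedDisplayDict(dict):
    def __str__(self):
        # Order numerically by start position, but print the original keys.
        return "{" + ", ".join(
            "%r: %r" % (key, self[key]) for key in sorted(self, key=range_start)
        ) + "}"


d = SortedDisplayDict({"10-16": "content", "9": "(", "0-8": "beginning"})
print(d)  # {'0-8': 'beginning', '9': '(', '10-16': 'content'}
```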

jwilner
  • Hi, this really helped me. However, is there a way we can account for both parts of the key, such as if we had keys as 8-9,8-4,4-9. Ideally, they would not just be ordered by the first number but by both such as: 4-9,8-4,8-9. – NJD Nov 14 '20 at 09:07
  • @NJD sure, just use a complex sorting key. `sorted(d.items(), key=lambda v: tuple(int(e) for e in v[0].split("-")))` – jwilner Nov 16 '20 at 20:44
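
A small worked example of sorting on both numbers, using the keys from the comment above. The helper name `range_key` and the placeholder values are assumptions. Note that each element of `d.items()` is a `(key, value)` tuple, so the key string has to be taken with `item[0]` before it is split.

```python
def range_key(key):
    """Turn '8-9' into (8, 9) and '17' into (17,) so keys compare numerically."""
    return tuple(int(part) for part in key.split("-"))


d = {"8-9": "c", "8-4": "b", "4-9": "a"}

# Tuples compare element by element: first number, then second number.
ordered = sorted(d.items(), key=lambda item: range_key(item[0]))
print(ordered)  # [('4-9', 'a'), ('8-4', 'b'), ('8-9', 'c')]
```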
4

You can convert the dictionary to a list of tuples (key-value pairs) and then sort them based on the second number in the key if it is hyphenated.

>>> data = {'24-35': 'another word', '10-16': 'content', '17':
...         '[', '22': ']', '23': '{', '37': ')', '0-8': 'beginning', '36': '}',
...         '18-21': 'word', '9': '('}
>>> from pprint import pprint
>>> pprint(sorted(data.items(),
...     key=lambda x: int(x[0].split("-")[1] if "-" in x[0] else x[0])))

[('0-8', 'beginning'),
 ('9', '('),
 ('10-16', 'content'),
 ('17', '['),
 ('18-21', 'word'),
 ('22', ']'),
 ('23', '{'),
 ('24-35', 'another word'),
 ('36', '}'),
 ('37', ')')]

Here, the important part is the key function, lambda x: int(x[0].split("-")[1] if "-" in x[0] else x[0]). It checks whether - is in x[0]; if so, it splits on - and takes the element at index 1, i.e. the number after the -. If the key has no -, the whole string is converted to int.


If you want to maintain the order of the items in the dictionary itself, you can use collections.OrderedDict, like this:

>>> from collections import OrderedDict
>>> d = OrderedDict(sorted(data.items(),
...         key=lambda x: int(x[0].split("-")[1] if "-" in x[0] else x[0])))
>>> for key, value in d.items():
...     print(key, value)
...
0-8 beginning
9 (
10-16 content
17 [
18-21 word
22 ]
23 {
24-35 another word
36 }
37 )
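
On Python 3.7 and later, a plain dict also preserves insertion order, so the same idea can be sketched without OrderedDict. The helper name `range_start` is an assumption, and this version sorts on the first number of each key rather than the second; for this data both give the same order.

```python
data = {'24-35': 'another word', '10-16': 'content', '17': '[', '22': ']',
        '23': '{', '37': ')', '0-8': 'beginning', '36': '}',
        '18-21': 'word', '9': '('}


def range_start(key):
    """Return the integer before the first '-' (or the whole key)."""
    return int(key.partition("-")[0])


# dict() keeps the order in which the sorted pairs are inserted (Python 3.7+).
ordered = dict(sorted(data.items(), key=lambda item: range_start(item[0])))

# Joining the values in key order rebuilds the original string.
print("".join(ordered.values()))  # beginning(content[word]{another word})
```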
thefourtheye
1
>>> from pprint import pprint
>>> d = {
'24-35': 'another word',
'10-16': 'content',
'17': '[',
'22': ']',
'23': '{',
'37': ')',
'0-8': 'beginning',
'36': '}',
'18-21': 'word',
'9': '(',
}
>>> pprint(sorted(d.items(), key=lambda x: int(x[0].partition('-')[0])))
[('0-8', 'beginning'),
 ('9', '('),
 ('10-16', 'content'),
 ('17', '['),
 ('18-21', 'word'),
 ('22', ']'),
 ('23', '{'),
 ('24-35', 'another word'),
 ('36', '}'),
 ('37', ')')]
jamylak