4

Book sections are usually numbered as x.x.x, such as 1.2.3. How do I sort a list of section numbers?

Store section numbers as a list of strings.

# a list of strings, section numbers
ls = ['1.1', '1.10', '1.2', '1.2.3', '1.2.1', '1.9']    

lists = sorted([s.split('.') for s in ls], key=lambda x: list(map(int, x)))
# [['1', '1'], ['1', '2'], ['1', '2', '1'], ['1', '2', '3'], ['1', '9'], ['1', '10']]

r = ['.'.join(sublist) for sublist in lists]    
#['1.1', '1.2', '1.2.1', '1.2.3', '1.9', '1.10']

However, my expected result is:

['1.1', '1.10', '1.2', '1.2.1', '1.2.3', '1.9']
SparkAndShine
  • `1.10` is not a section number, it is a float. If you want an object that represents a section number, create a class for it. Using a float for this is a terrible idea. – Vincent Savard Jun 02 '16 at 15:12
  • Wrong input data type. The section numbers are not floats; they are more like polynomial coefficients that sort lexicographically. When using floats, `1.1` is semantically equal to `1.10`, but this is not what you want. Keep the values as strings and sort by the split. Or even better: create a proper type. – dhke Jun 02 '16 at 15:13
  • @VincentSavard, I see. `1.10` represents Chapter `1`, Section `10`. – SparkAndShine Jun 02 '16 at 15:16
  • Why not store these as strings in the first place? Using floats for this is causing this whole problem. `1.10 == 1.1` and you can't make that not true as long as you're using numbers. This is not numeric data. – Two-Bit Alchemist Jun 02 '16 at 15:17
  • Floats have already been dis-recommended, but here's another nail in the coffin: how are you going to represent Chapter 1, Section 2, Subsection 3? – Kevin Jun 02 '16 at 15:17
  • @sparkandshine _I_ know what you meant, _Python_ does not. That's why types are important. Your intent can be made clearer by using an actual class that encodes a section number. – Vincent Savard Jun 02 '16 at 15:18
  • Are you sure you're not talking about strings in `lf`? – Colonel Beauvel Jun 02 '16 at 15:18
  • @Kevin, `1.2.3`. I need to edit my question. – SparkAndShine Jun 02 '16 at 15:18
  • @sparkandshine That's not going to work. That's a SyntaxError! – Two-Bit Alchemist Jun 02 '16 at 15:19

3 Answers

9

Use a custom key function that converts each string into a list of integers. Lists of integers compare element by element, so the sections sort numerically without problems.

In [4]: ls = ['1.1', '1.10', '1.2', '1.2.3', '1.2.1', '1.9']

In [5]: def section(s):
   ...:     return [int(_) for _ in s.split(".")]
   ...:

In [6]: sorted(ls, key=section)
Out[6]: ['1.1', '1.2', '1.2.1', '1.2.3', '1.9', '1.10']
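
The same key function can also reorder several parallel lists together. The following is a minimal sketch, not part of the original answer; the lists `numbers` and `titles` are hypothetical sample data, and `section` is repeated so the snippet runs on its own.

```python
def section(s):
    """Turn a dotted section string such as '1.2.3' into [1, 2, 3]."""
    return [int(part) for part in s.split(".")]

# Hypothetical parallel lists: titles[i] belongs to numbers[i].
numbers = ['1.10', '1.2', '1.1']
titles = ['Appendix', 'Methods', 'Introduction']

# Sort the pairs by the parsed section number of the first element.
pairs = sorted(zip(numbers, titles), key=lambda pair: section(pair[0]))

# Unzip back into separate, consistently ordered tuples.
sorted_numbers, sorted_titles = zip(*pairs)
print(sorted_numbers)  # ('1.1', '1.2', '1.10')
print(sorted_titles)   # ('Introduction', 'Methods', 'Appendix')
```

Parsing only the first element of each pair keeps the other lists untouched while they follow the section order.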
Tim Pietzcker
  • Can I use `lambda` like `sorted(ls, key=lambda x: [int(_) for _ in x.split('.')])`? – qvpham Jun 02 '16 at 15:30
  • @julivico: Sure, that's essentially the same thing. – Tim Pietzcker Jun 02 '16 at 15:31
  • @TimPietzcker Is there a way to use this function together with the zip function? I know this is incorrect, but it would look similar to this: `result = list(zip(*sorted(zip(l1 key = section, l2, l3, files_good_list), key=lambda x:float(x[0]))))` – Jstuff Jun 02 '16 at 15:57
  • @Jstuff: So you want to zip four lists together and sort all of them in the same order as `l1`? And what is the second key used for? – Tim Pietzcker Jun 02 '16 at 16:20
  • Yes, but `l1` is a list of strings with floating values. If I just do `result = list(zip(*sorted(zip(l1, l2, l3, files_good_list), key=lambda x:float(x[0]))))`, it will sort `l1` as 1.1, 1.10, 1.2, not 1.1, 1.2, 1.10. So I need the second key to sort it as you did in the answer to the question. The question you answered originated because of this post (http://stackoverflow.com/questions/37592787/combing-2d-list-of-tuples-and-then-sorting-them-in-python/37593267?noredirect=1#comment62674466_37593267). Perhaps that will make things clearer. – Jstuff Jun 02 '16 at 16:25
4

Book sections are usually numbered as x.x.x

Why not store the section numbers as tuples?

sections = [(2, 4, 1), (1, 10, 3),(1, 2, 1), (1, 1, 10), (1, 2, 3), (1, 4, 6)]

print(sorted(sections)) 

gives [(1, 1, 10), (1, 2, 1), (1, 2, 3), (1, 4, 6), (1, 10, 3), (2, 4, 1)]
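
If the data arrives as dotted strings, it can be converted to tuples for sorting and back to strings for display. This is a minimal sketch under that assumption; the helper names `parse_section` and `format_section` are made up for illustration.

```python
def parse_section(text):
    """Convert '1.2.3' into the tuple (1, 2, 3)."""
    return tuple(int(part) for part in text.split("."))

def format_section(parts):
    """Convert the tuple (1, 2, 3) back into '1.2.3'."""
    return ".".join(str(part) for part in parts)

raw = ['1.1', '1.10', '1.2', '1.2.3', '1.2.1', '1.9']

# Tuples compare element by element, so plain sorted() is enough.
sections = sorted(parse_section(s) for s in raw)
result = [format_section(t) for t in sections]
print(result)  # ['1.1', '1.2', '1.2.1', '1.2.3', '1.9', '1.10']
```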

Klaus-Dieter Warzecha
4

As the comments point out, float is not the data type you need. In your case, you have an actual hierarchy of chapters and sections.

One simple (and remember, simple is better than complex) way is to represent the section numbers as tuples. Since tuples are sorted lexicographically, they naturally sort in the desired order:

>>> lf = [(1, ), (1, 1), (1, 10), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (2, ), (1, 9)]
>>> sorted(lf)
[(1,), (1, 1), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (1, 10), (2,)]

As we can see, this also works for tuples with varying lengths.

If you want to keep the sections as strings, the third-party natsort package does a fine job of handling dotted values, too:

>>> import natsort
>>> s = ['1', '1.1', '1.10', '1.2']
>>> natsort.natsorted(s)
['1', '1.1', '1.2', '1.10']

You can also define your own SectionNumber class, but that's probably overkill.
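
For completeness, here is one way such a class might look. It is only a sketch: the class name `SectionNumber` comes from the sentence above, but its constructor, the `parts` attribute, and the methods are assumptions chosen for illustration, built on the standard `functools.total_ordering` decorator.

```python
from functools import total_ordering

@total_ordering
class SectionNumber:
    """Hypothetical section number that orders by its integer parts."""

    def __init__(self, text):
        # '1.2.3' is stored as the tuple (1, 2, 3).
        self.parts = tuple(int(p) for p in text.split("."))

    def __eq__(self, other):
        if not isinstance(other, SectionNumber):
            return NotImplemented
        return self.parts == other.parts

    def __lt__(self, other):
        if not isinstance(other, SectionNumber):
            return NotImplemented
        return self.parts < other.parts

    def __hash__(self):
        return hash(self.parts)

    def __str__(self):
        return ".".join(map(str, self.parts))

    def __repr__(self):
        return f"SectionNumber({str(self)!r})"

sections = [SectionNumber(s) for s in ['1.10', '1.2', '1', '1.2.1']]
ordered = [str(s) for s in sorted(sections)]
print(ordered)  # ['1', '1.2', '1.2.1', '1.10']
```

Unlike floats, `SectionNumber('1.10')` and `SectionNumber('1.1')` compare as different values here, because the parts `(1, 10)` and `(1, 1)` differ.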

dhke