How do I make Python sorting behave like sort -n from GNU coreutils?

This is my images.txt:

Vol. 1/Vol. 1 - Special 1/002.png
Vol. 1/Chapter 2 example text/002.png
Vol. 1/Vol. 1 Extra/002.png
Vol. 1/Chapter 2 example text/001.png
Vol. 1/Vol. 1 Extra/001.png
Vol. 1/Chapter 1 example text/002.png
Vol. 1/Vol. 1 - Special 1/001.png
Vol. 1/Chapter 1 example text/001.png

When I run this Bash script:

#!/bin/bash

cat images.txt | sort -n

I get the following output:

Vol. 1/Chapter 1 example text/001.png
Vol. 1/Chapter 1 example text/002.png
Vol. 1/Chapter 2 example text/001.png
Vol. 1/Chapter 2 example text/002.png
Vol. 1/Vol. 1 Extra/001.png
Vol. 1/Vol. 1 Extra/002.png
Vol. 1/Vol. 1 - Special 1/001.png
Vol. 1/Vol. 1 - Special 1/002.png

But when I run this Python script:

#!/usr/bin/env python3

images = []

with open("images.txt") as images_file:
    for image in images_file:
        images.append(image)

images = sorted(images)

for image in images:
    print(image, end="")

I get the following output, which is not what I need:

Vol. 1/Chapter 1 example text/001.png
Vol. 1/Chapter 1 example text/002.png
Vol. 1/Chapter 2 example text/001.png
Vol. 1/Chapter 2 example text/002.png
Vol. 1/Vol. 1 - Special 1/001.png
Vol. 1/Vol. 1 - Special 1/002.png
Vol. 1/Vol. 1 Extra/001.png
Vol. 1/Vol. 1 Extra/002.png

How do I achieve the same result with Python that I achieve with Bash and sort -n?

  • Possible duplicate of [Does Python have a built in function for string natural sort?](https://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort) – roganjosh Oct 29 '18 at 17:35
  • I'm not sure there's a built-in method; it seems like `-` is treated differently. Maybe `sorted(L, key=lambda x: x.replace(' - ', ' '))`? – jpp Oct 29 '18 at 17:36
  • I am missing the "numerical" part of your question. The numbers are sorted equally in both examples. – Jongware Oct 29 '18 at 17:36
  • @jpp Thank you, that works. Can you please submit your solution as an answer so I can accept it? – user9593274 Oct 29 '18 at 17:44

2 Answers

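While I'm no expert, it seems sort orders '-' differently than Python does, possibly because sort follows the locale's collation rules while Python compares strings by code point. A quick fix for this particular issue is to replace ' - ' with ' ' when sorting: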

L = sorted(L, key=lambda x: x.replace(' - ', ' '))
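
To check the quick fix against the data in the question, here is a minimal sketch. The helper name `dash_insensitive_key` is made up for illustration, and the list simply hard-codes the lines of images.txt instead of reading the file:

```python
# Lines of images.txt from the question, hard-coded so the sketch is self-contained.
images = [
    "Vol. 1/Vol. 1 - Special 1/002.png",
    "Vol. 1/Chapter 2 example text/002.png",
    "Vol. 1/Vol. 1 Extra/002.png",
    "Vol. 1/Chapter 2 example text/001.png",
    "Vol. 1/Vol. 1 Extra/001.png",
    "Vol. 1/Chapter 1 example text/002.png",
    "Vol. 1/Vol. 1 - Special 1/001.png",
    "Vol. 1/Chapter 1 example text/001.png",
]


def dash_insensitive_key(line):
    """Hypothetical helper: compare lines as if ' - ' were a single space."""
    return line.replace(" - ", " ")


# Only the sort key changes; the original strings are returned untouched.
ordered = sorted(images, key=dash_insensitive_key)

for line in ordered:
    print(line)
```

For this input the order matches the sort -n output shown in the question, with "Vol. 1 Extra" ahead of "Vol. 1 - Special 1". It only special-cases the ' - ' sequence, so other punctuation is still compared the way Python normally compares it.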
jpp

You might also want to consider a lambda that strips all non-alphanumeric characters before comparing (this needs import re):

import re

images = sorted(images, key=lambda x: re.sub(r'[^A-Za-z\d]+', '', x))
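
As a rough sketch of how that key behaves, the snippet below applies it to three folder names taken from the question. The names `_NON_ALNUM` and `alnum_key` are illustrative only, and the pattern uses `0-9` rather than `\d`, which is an assumption that only ASCII digits matter here:

```python
import re

# Precompiled pattern matching runs of anything that is not an ASCII letter or digit.
_NON_ALNUM = re.compile(r"[^A-Za-z0-9]+")


def alnum_key(line):
    """Hypothetical helper: drop spaces, punctuation, and newlines before comparing."""
    return _NON_ALNUM.sub("", line)


names = [
    "Vol. 1 - Special 1/001.png",
    "Vol. 1 Extra/002.png",
    "Vol. 1 Extra/001.png",
]

# Keys become "Vol1Special1001png", "Vol1Extra002png", "Vol1Extra001png".
ordered_names = sorted(names, key=alnum_key)

for name in ordered_names:
    print(name)
```

One thing to keep in mind is that this key is lossy: strings that differ only in punctuation or spacing, such as "a-b" and "ab", produce the same key. Because Python's sort is stable, such ties simply keep their input order.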