
I'm trying to loop through my files and process them two at a time, since every two consecutive files are related.

I have the files sorted in my directory, and I used the following to loop through the directory and read the pairs of files:

for root, dirs, files in os.walk(TRAIN_DIR):
    for file1, file2 in itertools.izip_longest(files[::2], files[1::2]):

However, file1 and file2 come back in a different order, rather than as the two files that sit next to each other in the directory. Does os.walk return files unsorted? How can I walk through the files in sorted order?

This is how my first four files are listed on my system:

0a1a465c-a28d-4926-8a79-81ba83408c52.1.a
0a1a465c-a28d-4926-8a79-81ba83408c52.2.a
0a1b8b67-6c03-47c6-9af9-0e0091148e06.1.a
0a1b8b67-6c03-47c6-9af9-0e0091148e06.2.a

How can I read them in that order?

Thanks.

Simplicity
  • If they are not returned in the same order that they display in the file system, then by what condition are they ordered in this manner? – cs95 Feb 15 '18 at 08:29
  • @cᴏʟᴅsᴘᴇᴇᴅ In my system (Mac OS X), they are arranged by name – Simplicity Feb 15 '18 at 08:30
  • If I were to put these names in a list, shuffle them, and then call `files.sort()`, it gives me exactly what you are asking for. – cs95 Feb 15 '18 at 08:32
  • @cᴏʟᴅsᴘᴇᴇᴅ Did you try print file1 and print file2. Here, I get them returned in different orders – Simplicity Feb 15 '18 at 08:35
  • I'm on python3.6, and I tried printing file1 and file2, they print as expected. Note that in python3 it's `zip_longest`. – cs95 Feb 15 '18 at 08:36

2 Answers


As explained in "os.walk iterates in what order?", os.walk does not guarantee any particular order, so you can call files.sort() before the inner loop:

for root, dirs, files in os.walk(TRAIN_DIR):
    files.sort()  # sort the names in place before pairing them up
    for file1, file2 in itertools.izip_longest(files[::2], files[1::2]):
        pass  # process the pair (file1, file2) here
debiandado
  • the builtin function [`zip`](https://docs.python.org/library/functions.html#zip) is enough here – Skandix Feb 15 '18 at 09:18
  • You're right. Maybe he chose izip_longest() because it is memory efficient? [link](https://docs.python.org/2/library/itertools.html) – debiandado Feb 16 '18 at 09:04
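
For what it's worth, zip and zip_longest only behave differently when the number of files is odd. A small sketch, written for Python 3 (where the function is called zip_longest) and using made-up file names that are not from the question:

```python
from itertools import zip_longest  # itertools.izip_longest in Python 2

# Hypothetical, already sorted names with an odd count.
files = ["a.1", "a.2", "b.1", "b.2", "c.1"]

# zip stops at the shorter slice, so the unpaired "c.1" is silently dropped.
pairs_zip = list(zip(files[::2], files[1::2]))

# zip_longest pads the shorter slice with None, so "c.1" still shows up.
pairs_longest = list(zip_longest(files[::2], files[1::2]))

print(pairs_zip)
print(pairs_longest)
```

If every file is guaranteed to have a partner, either one should do; zip_longest just makes a missing partner visible as None.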

os.walk does not yield files in any particular order, because files have no inherent order. It's you (or your operating system) that gives them one by arranging them according to some property: name, creation date, owning user, ...

If you want to access your files sorted by some property, then first retrieve them into a list, sort that list, and go on with processing afterwards.
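
A minimal sketch of that approach, assuming Python 3 and sorting by name. The helper name sorted_file_pairs is made up for illustration; sorting dirs in place additionally controls the order in which os.walk descends into subdirectories (with the default topdown=True):

```python
import os
from itertools import zip_longest  # itertools.izip_longest in Python 2


def sorted_file_pairs(top):
    """Yield (root, file1, file2) with each directory's files sorted by name."""
    for root, dirs, files in os.walk(top):
        dirs.sort()   # in-place sort fixes the order subdirectories are visited in
        files.sort()  # os.walk makes no ordering promise for this list
        # Pair neighbours: (files[0], files[1]), (files[2], files[3]), ...
        for file1, file2 in zip_longest(files[::2], files[1::2]):
            yield root, file1, file2
```

You would then loop with something like `for root, file1, file2 in sorted_file_pairs(TRAIN_DIR):`. To sort by another property, pass a key function to files.sort(), for example one based on os.path.getmtime of the joined path.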

jbndlr