882

I need to iterate through all .asm files inside a given directory and do some actions on them.

How can this be done in an efficient way?

Alan W. Smith
Itzik984

11 Answers

1217

Python 3.6 version of the above answer, using os - assuming that you have the directory path as a str object in a variable called directory_in_str:

import os

directory = os.fsencode(directory_in_str)

for file in os.listdir(directory):
    filename = os.fsdecode(file)
    if filename.endswith(".asm") or filename.endswith(".py"):
        # join with the str path: mixing bytes and str raises TypeError
        # print(os.path.join(directory_in_str, filename))
        continue
    else:
        continue

Or recursively, using pathlib:

from pathlib import Path

pathlist = Path(directory_in_str).glob('**/*.asm')
for path in pathlist:
     # because path is object not string
     path_in_str = str(path)
     # print(path_in_str)
  • You can replace glob('**/*.asm') with rglob('*.asm').
    • This is like calling Path.glob() with '**/' added in front of the given relative pattern:
from pathlib import Path

pathlist = Path(directory_in_str).rglob('*.asm')
for path in pathlist:
     # because path is object not string
     path_in_str = str(path)
     # print(path_in_str)
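
As a quick check of the equivalence described above, here is a minimal sketch that builds a throwaway directory tree and compares the two calls. The helper name find_asm and the file names are made up for illustration; they are not part of the answer.

```python
import tempfile
from pathlib import Path


def find_asm(root):
    """Return all .asm files under root, searched recursively, in sorted order."""
    return sorted(Path(root).rglob("*.asm"))


# Build a small temporary tree so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "sub").mkdir()
    (base / "a.asm").touch()
    (base / "sub" / "b.asm").touch()
    (base / "sub" / "notes.txt").touch()

    via_rglob = find_asm(base)
    via_glob = sorted(base.glob("**/*.asm"))
    found_names = [p.name for p in via_rglob]
    same_result = via_rglob == via_glob

print(found_names)
print(same_result)
```

Both calls return generators, so the sketch wraps them in sorted() to get a list with a stable order.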

Original answer:

import os

for filename in os.listdir("/path/to/dir/"):
    if filename.endswith(".asm") or filename.endswith(".py"):
        # print(os.path.join("/path/to/dir/", filename))
        continue
    else:
        continue
Gulzar
anselm
  • 1
    This just seems to list the directories or files immediately under a directory. The answer by pedromateo below seems to do a recursive listing. – Jay Sheth Mar 31 '16 at 17:14
  • 12
    Please note that in Python 3.6 directory is expected to be in bytes and then listdir will spit out a list of filenames also in bytes data type so you cannot run endswith directly on it. This code block should be changed to `directory = os.fsencode(directory_in_str) for file in os.listdir(directory): filename = os.fsdecode(file) if filename.endswith(".asm") or filename.endswith(".py"): # print(os.path.join(directory, filename)) continue else: continue` – Kim Stacks May 20 '17 at 04:52
  • 15
    `print(os.path.join(directory, filename))` need to be changed to `print(os.path.join(directory_in_str, filename))` to get it to work in python 3.6 – Hugo Koopmans Jun 03 '17 at 21:17
  • On the third example, why not just use `.rglob("*.asm")` to check recursively. – EndermanAPM Nov 23 '17 at 08:27
  • @KimStacks, check docs, it supports both string and bytes, returning value type will depend on it: https://docs.python.org/3/library/os.html#os.listdir – vvkuznetsov Nov 27 '17 at 10:32
  • 110
    If you're seeing this in 2017 or beyond, os.scandir(dir_str) is now available and much cleaner to use. No need for fsencode. `for entry in os.scandir(path): print(entry.path)` – g.o.a.t. Dec 21 '17 at 00:49
  • 3
    Prefer `if filename.endswith((".asm", ".py")):` to `if filename.endswith(".asm") or filename.endswith(".py"):` – Maroloccio Jan 20 '18 at 14:13
  • 1
    @g.o.a.t. where's the answer including an example so I can upvote it? :) – jeromej Jun 25 '18 at 17:06
  • 1
    Note that you may use [`Path.rglob(pattern)`](https://docs.python.org/3.6/library/pathlib.html#pathlib.Path.rglob) instead. This is like calling `Path.glob()` with `“**”` added in front of the given pattern. – Константин Ван Jul 17 '18 at 23:08
  • In the first solution, are you able to use np.loadtxt ? For example: for filename in os.listdir('./'): if filename.endswith(".hrt"): Sin,Sout,IR = np.genfromtxt(filename,skip_header=11,usecols=(5,7,9),delimiter=' ', unpack = True) – JadeChee Sep 26 '18 at 23:05
  • 4
    Python 3.7+ : remove the line directory = os.fsencode(directory_in_str) as was mentioned here: https://stackoverflow.com/questions/48729364/python-typeerror-cant-mix-strings-and-bytes-in-path-components – justi Oct 29 '18 at 16:37
  • Is it necessary to put the keyword `continue` inside the if condition? – Lahiru Karunaratne Nov 05 '18 at 06:59
  • Remember to put an `r` before the path string or escape the backslash characters. – LoMaPh Mar 11 '19 at 18:33
  • In the second example, use `file` instead of `filename` in the print statement. – Aaron Nov 14 '19 at 06:05
  • @g.o.a.t. Pycharm tells me entry.path is unrecognized. Any way to let it know the correct type of `entry`? – Gulzar Dec 30 '21 at 13:49
  • Why the `continue`s? – user118967 Jun 20 '22 at 04:21
215

This will iterate over all descendant files, not just the immediate children of the directory:

import os

# rootdir is the directory to start from
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        filepath = os.path.join(subdir, file)

        if filepath.endswith(".asm"):
            print(filepath)
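
If you want to reuse the walk in several places, it could be wrapped in a generator along these lines. This is only a sketch: the function name iter_files_with_suffix is made up, and skipping hidden folders is an optional extra that the answer does not do.

```python
import os
import tempfile


def iter_files_with_suffix(rootdir, suffix):
    """Yield the full path of every file under rootdir whose name ends with suffix."""
    for subdir, dirs, files in os.walk(rootdir):
        # Optional: pruning dirs in place stops os.walk from descending
        # into hidden folders such as .git (an assumption of this sketch).
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        for name in files:
            if name.endswith(suffix):
                yield os.path.join(subdir, name)


# Demonstrate on a temporary tree so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "src", "deep"))
    os.makedirs(os.path.join(tmp, ".git"))
    for rel in ("top.asm", "src/deep/inner.asm", "src/readme.md", ".git/ignored.asm"):
        open(os.path.join(tmp, *rel.split("/")), "w").close()

    walked = sorted(
        os.path.relpath(p, tmp).replace(os.sep, "/")
        for p in iter_files_with_suffix(tmp, ".asm")
    )

print(walked)
```

The in-place assignment to dirs only has an effect with the default topdown=True.
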
Flimm
pedromateo
  • 5
    A reference for the os.walk function is found at the following: https://docs.python.org/2/library/os.path.html#os.path.walk – ScottMcC Jan 31 '17 at 06:51
182

You can try using the glob module:

import glob

for filepath in glob.iglob('my_dir/*.asm'):
    print(filepath)

and since Python 3.5 you can search subdirectories as well:

glob.glob('**/*.txt', recursive=True) # => ['2.txt', 'sub/3.txt']

From the docs:

The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix shell, although results are returned in arbitrary order. No tilde expansion is done, but *, ?, and character ranges expressed with [] will be correctly matched.
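
Two consequences of that passage are that a leading ~ has to be expanded by hand and that results should be sorted if order matters. A minimal sketch, assuming a made-up helper name sorted_glob:

```python
import glob
import os
import tempfile


def sorted_glob(pattern):
    """Expand a leading ~ ourselves, then return the matches in sorted order."""
    return sorted(glob.glob(os.path.expanduser(pattern)))


with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.asm", "a.asm", "c.txt"):
        open(os.path.join(tmp, name), "w").close()

    # glob.escape protects any special characters in the directory part.
    pattern = os.path.join(glob.escape(tmp), "*.asm")
    matches = [os.path.basename(p) for p in sorted_glob(pattern)]

print(matches)
```

A call such as sorted_glob('~/my_dir/*.asm') would then look inside the home directory, which plain glob.glob would not do.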

Brian Burns
Doboy
  • For people confused about the ** pattern, this is from the documentation https://docs.python.org/3/library/glob.html : "If recursive is true, the pattern “**” will match any files and zero or more directories, subdirectories and symbolic links to directories." – rmutalik May 02 '23 at 13:33
89

Since Python 3.5, os.scandir() makes this much easier, and it is reported to be 2-20x faster (source):

import os

# path is the directory to scan; the with-statement form needs Python 3.6+
with os.scandir(path) as it:
    for entry in it:
        if entry.name.endswith(".asm") and entry.is_file():
            print(entry.name, entry.path)

Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information, because os.DirEntry objects expose this information if the operating system provides it when scanning a directory. All os.DirEntry methods may perform a system call, but is_dir() and is_file() usually only require a system call for symbolic links; os.DirEntry.stat() always requires a system call on Unix but only requires one for symbolic links on Windows.
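
os.scandir() only lists one directory, so a recursive search needs a little extra code. The following is a rough sketch of one way to do it with the DirEntry methods mentioned above; the function name scan_tree is made up, and not following symlinked directories is a choice of this sketch, not something the answer specifies.

```python
import os
import tempfile


def scan_tree(path, suffix):
    """Recursively yield os.DirEntry objects for regular files ending in suffix."""
    with os.scandir(path) as it:
        for entry in it:
            # follow_symlinks=False avoids loops through symlinked directories.
            if entry.is_dir(follow_symlinks=False):
                yield from scan_tree(entry.path, suffix)
            elif entry.is_file() and entry.name.endswith(suffix):
                yield entry


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "nested"))
    for rel in ("boot.asm", os.path.join("nested", "irq.asm"), "notes.txt"):
        open(os.path.join(tmp, rel), "w").close()

    scanned = sorted(entry.name for entry in scan_tree(tmp, ".asm"))

print(scanned)
```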

Neuron
crypdick
  • 1
    `entry` is a [posix.DirEntry](https://docs.python.org/3/library/os.html#os.DirEntry) type with a bunch of handy methods such as `entry.is_dir()`, `is_file()`, `is_symlink()` – crypdick Jun 07 '19 at 15:04
  • @tejasvi88 otherwise you need to call `scandir.close()` explicitly to close the iterator and free acquired resources – crypdick Jul 26 '21 at 17:22
  • I got `Expected type 'collections.Iterable', got 'DirEntry[AnyStr]' instead` warning on `for entry in it`. – liushuaikobe Nov 28 '22 at 09:21
26

Python 3.4 and later offer pathlib in the standard library. You could do:

from pathlib import Path

asm_pths = [pth for pth in Path.cwd().iterdir()
            if pth.suffix == '.asm']

Or if you don't like list comprehensions:

asm_paths = []
for pth in Path.cwd().iterdir():
    if pth.suffix == '.asm':
        asm_paths.append(pth)

Path objects can easily be converted to strings.
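
For instance, a small sketch of the conversion (the path itself is made up and does not need to exist on disk):

```python
import os
from pathlib import Path

pth = Path("project") / "src" / "main.asm"

as_text = str(pth)           # plain string, using the platform's separator
via_fspath = os.fspath(pth)  # same string via the os.PathLike protocol (Python 3.6+)

print(as_text)
print(pth.name, pth.stem, pth.suffix)
```

Most standard-library functions that take a file name, such as open(), accept the Path object directly from Python 3.6 on, so the conversion is mainly needed for code that insists on str.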

Flimm
Greg
14

Here's how I iterate through files in Python:

import os

path = 'the/name/of/your/path'

folder = os.fsencode(path)

filenames = []

for file in os.listdir(folder):
    filename = os.fsdecode(file)
    if filename.endswith(('.jpeg', '.png', '.gif')):  # whatever file types you're using...
        filenames.append(filename)

filenames.sort() # now you have the filenames and can do something with them

NONE OF THESE TECHNIQUES GUARANTEE ANY ITERATION ORDERING

Yup, super unpredictable. Notice that I sort the filenames, which is important if the order of the files matters, e.g. for video frames or time-dependent data collection. Be sure to put indices in your filenames though!
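
One caveat is that a plain sort compares names character by character, so unpadded indices come out as im1, im10, im11, im2. A possible workaround is a "natural" sort key that compares the digit runs as integers. This is a sketch; natural_key is a made-up helper, not a standard-library function.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so that 'im2' sorts before 'im10'."""
    # re.split with a capturing group puts the digit runs at the odd indices.
    return [int(chunk) if i % 2 else chunk.lower()
            for i, chunk in enumerate(re.split(r"(\d+)", name))]


frames = ["im10.png", "im2.png", "im1.png", "im11.png"]

plain = sorted(frames)
natural = sorted(frames, key=natural_key)

print(plain)    # ['im1.png', 'im10.png', 'im11.png', 'im2.png']
print(natural)  # ['im1.png', 'im2.png', 'im10.png', 'im11.png']
```

Zero-padding the indices (im001, im002, ...) avoids the problem altogether, since then the plain sort already gives the intended order.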

Daniel McGrath
  • Not always sorted... _im1,im10,im11..., im2..._ Otherwise useful approach. `from pkg_resources import parse_version` and `filenames.sort(key=parse_version)` did it. – Hastur Apr 16 '20 at 23:10
9

You can use glob to match the files in a directory against a pattern:

import glob
import os

# Get the current working directory name
cwd = os.getcwd()
# Load the images from the images folder
for f in glob.glob(os.path.join('images', '*.jpg')):
    # get_dir_name is not defined in this answer; supply your own helper
    dir_name = get_dir_name(f)
    image_file_name = dir_name + '.jpg'
    # Print the file name with its path (the path is a string)
    print(image_file_name)

To get a list of all the entries in a directory, you can use os:

os.listdir(directory)
YAP
4

I'm not quite happy with this implementation yet. I wanted a custom constructor that does DirectoryIndex._make(next(os.walk(input_path))), so that you can just pass the path you want a file listing for. Edits welcome!

import collections
import os

DirectoryIndex = collections.namedtuple('DirectoryIndex', ['root', 'dirs', 'files'])

index = DirectoryIndex(*next(os.walk('.')))
for file_name in index.files:
    file_path = os.path.join(index.root, file_name)
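
One way the custom constructor described above might look is a classmethod on a namedtuple subclass. This is a sketch rather than the author's implementation: the names from_path and file_paths are made up for illustration.

```python
import collections
import os
import tempfile


class DirectoryIndex(collections.namedtuple("DirectoryIndex", ["root", "dirs", "files"])):
    """Immediate contents of one directory, taken from the first os.walk step."""
    __slots__ = ()

    @classmethod
    def from_path(cls, input_path):
        # next(os.walk(...)) gives (root, dirs, files) for input_path itself only.
        return cls._make(next(os.walk(input_path)))

    def file_paths(self):
        """Return the files joined with the root they were listed under."""
        return [os.path.join(self.root, name) for name in self.files]


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    for name in ("a.asm", "b.txt"):
        open(os.path.join(tmp, name), "w").close()

    index = DirectoryIndex.from_path(tmp)
    listed = sorted(os.path.basename(p) for p in index.file_paths())

print(index.dirs, listed)
```

Note that os.walk ignores errors by default, so from_path raises StopIteration when input_path does not exist.
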
ThorSummoner
3

I really like using the scandir function that is built into the os module. Here is a working example:

import os

i = 0
with os.scandir('/usr/local/bin') as root_dir:
    for entry in root_dir:
        if entry.is_file():
            i += 1
            print(f"Full path is: {entry.path} and just the name is: {entry.name}")
print(f"{i} files scanned successfully.")
james-see
2

I don't understand why some answers are complicated. This is how I would do it with Python 2.7. Replace DIRECTORY_TO_LOOP with the directory you want to use.

import os
DIRECTORY_TO_LOOP = '/var/www/files/'
for root, dirs, files in os.walk(DIRECTORY_TO_LOOP, topdown=False):
    for name in files:
        print(os.path.join(root, name))
George Chalhoub
1

Get all the .asm files in a directory by doing this.

import os

path = "path_to_file"
file_type = '.asm'

for filename in os.listdir(path=path):
    if filename.endswith(file_type):
        print(filename)
        print(os.path.join(path, filename))
        # do something below