1

Is it possible to use os.listdir to list only CSV files?

src_files = os.listdir('M:/Dashboards/Team/Metrics', suffix=".csv")

If not, what is the most efficient way to list CSV files in one line?

ZJAY

5 Answers

0

I think this is what you are looking for: the Python glob module.

With glob you can filter the filename.

Your code could look like this then:

import glob
# "*.csv" is matched against the current working directory
for file in glob.glob("*.csv"):
    print(file)

It isn't listdir, but it works just as well. I hope that's better than nothing ;)
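Since the question asks about a specific folder rather than the current directory, here is a minimal sketch of the same idea with the directory made explicit. The helper name list_csv_files and the temporary demo folder are my own assumptions for illustration, not part of the original question.

```python
import glob
import os
import tempfile


def list_csv_files(directory):
    """Return sorted paths of the *.csv entries directly inside directory.

    Hypothetical helper for illustration only.
    """
    # glob.escape keeps characters like [ or * in the directory name literal.
    pattern = os.path.join(glob.escape(directory), "*.csv")
    return sorted(glob.glob(pattern))


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.csv", "b.csv", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = list_csv_files(tmp)
    print([os.path.basename(p) for p in found])  # ['a.csv', 'b.csv']
```

For the folder in the question you would call it as list_csv_files('M:/Dashboards/Team/Metrics').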

Whoozy
0

from pathlib import Path

src_files = Path('M:/Dashboards/Team/Metrics').glob("*.csv")

Note that Path.glob() returns a generator. If you only need to list that directory once, though, you could do:

import glob

src_files = glob.glob('M:/Dashboards/Team/Metrics/*.csv')
EdvardM
  • Thanks. Is there any reason to prefer the generator over the second solution? – ZJAY Apr 09 '20 at 16:28
  • That's a really good question :) Generators have the benefit of saving memory and being lazily evaluated. Imagine a directory containing millions of files, of which you actually need only the first two. Without generators, you need to build the whole result in memory(!) before working on it, whereas generators produce only one item at a time. The trade-off is that a generator is consumed once it has been iterated through. See https://stackoverflow.com/questions/47789/generator-expressions-vs-list-comprehension – EdvardM Apr 11 '20 at 14:12
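To illustrate the trade-off from the comments, here is a rough sketch using a temporary folder (the folder and the report_N.csv file names are made up for the demo). It takes the first two matches from the generator without collecting every match, and shows that the same generator yields nothing on a second pass.

```python
import tempfile
from itertools import islice
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for i in range(5):
        (folder / f"report_{i}.csv").touch()

    # Lazy use: stop after two matches instead of collecting all of them.
    first_two = list(islice(folder.glob("*.csv"), 2))

    # Eager use: wrap in list() when you need len(), indexing, or several passes.
    all_csv = list(folder.glob("*.csv"))

    # The result of glob() can only be iterated once.
    gen = folder.glob("*.csv")
    first_pass = list(gen)
    second_pass = list(gen)

print(len(first_two), len(all_csv), len(first_pass), len(second_pass))
```

If you need the results more than once, convert to a list up front; if you only loop once, the generator is fine as it is.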
0

Hello,

So, first of all, os.listdir() actually returns both files and directories.

To keep only files, filter with os.path.isfile, then check each file's extension:

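from os import listdir
from os.path import isfile, join

path = "."  # placeholder: the directory you want to search
# keep regular files only (listdir also returns directory names)
files = [f for f in listdir(path) if isfile(join(path, f))]
for each_file in files:
    if each_file.endswith(".csv"):
        print(each_file)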

There are more code examples on ProgramCreek.
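As a minimal sketch of why the isfile check matters: a directory can have a name ending in .csv too, and os.listdir returns it like any other entry. The helper csv_files_only and the file names below are hypothetical, chosen only for the demo.

```python
import os
import tempfile


def csv_files_only(path):
    """Names of regular files in path that end with .csv (hypothetical helper)."""
    return sorted(
        name
        for name in os.listdir(path)
        if name.endswith(".csv") and os.path.isfile(os.path.join(path, name))
    )


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "sales.csv"), "w").close()
    os.mkdir(os.path.join(tmp, "archive.csv"))  # a directory with a .csv name

    everything = sorted(os.listdir(tmp))
    only_files = csv_files_only(tmp)

print(everything)  # ['archive.csv', 'sales.csv']
print(only_files)  # ['sales.csv']
```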

0

For a clean one-liner, I suggest a list comprehension:

import os

the_dir = './'
file_list = [f for f in os.listdir(the_dir) if f.endswith('.csv')]
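One caveat when filtering on the name: a substring test such as '.csv' in f also matches names like metrics.csv.bak, while str.endswith only matches a real suffix. A small sketch with made-up file names (no disk access needed) shows the difference, plus a case-insensitive variant in case some files are named .CSV:

```python
# Made-up names, only to compare the three filters.
names = ["metrics.csv", "metrics.csv.bak", "old.csv.zip", "REPORT.CSV", "notes.txt"]

substring_match = [n for n in names if ".csv" in n]
suffix_match = [n for n in names if n.endswith(".csv")]
any_case_match = [n for n in names if n.lower().endswith(".csv")]

print(substring_match)  # ['metrics.csv', 'metrics.csv.bak', 'old.csv.zip']
print(suffix_match)     # ['metrics.csv']
print(any_case_match)   # ['metrics.csv', 'REPORT.CSV']
```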
Thom Ives
0

There are several ways you can achieve this:

  1. Using the glob module with a wildcard:
import glob
glob.glob("/path/to/my/dir/*.csv")

Using timeit, the performance in my simple test was:

96.5 µs ± 10.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  2. Using the os module's listdir method with a filter:
import os
[x for x in os.listdir("/path/to/my/dir/") if x.endswith(".csv")]

Measured performance:

57.8 µs ± 772 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  3. Using the os module's walk method (note that walk also descends into subdirectories):
import os
[file for root, dirs, files in os.walk("/path/to/my/dir/") for file in files if file.endswith(".csv")]

Measured performance:

386 µs ± 3.26 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

The listdir method seems to be the most efficient here, but keep in mind that the performance depends on the number of files in the directory.
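If you want to repeat the comparison on your own machine, here is a rough sketch using the timeit module on a temporary folder. The function names, the folder contents (50 .csv and 50 .txt files), and the number of repetitions are arbitrary assumptions; absolute timings will differ from the figures above.

```python
import glob
import os
import tempfile
import timeit


def with_glob(directory):
    pattern = os.path.join(glob.escape(directory), "*.csv")
    return [os.path.basename(p) for p in glob.glob(pattern)]


def with_listdir(directory):
    return [x for x in os.listdir(directory) if x.endswith(".csv")]


def with_walk(directory):
    # os.walk would also visit subdirectories; the demo folder has none.
    return [
        f
        for _root, _dirs, files in os.walk(directory)
        for f in files
        if f.endswith(".csv")
    ]


methods = (with_glob, with_listdir, with_walk)
runs = 200  # arbitrary repetition count

with tempfile.TemporaryDirectory() as tmp:
    for i in range(50):
        open(os.path.join(tmp, f"file_{i}.csv"), "w").close()
        open(os.path.join(tmp, f"file_{i}.txt"), "w").close()

    # Check that all three approaches agree before timing them.
    results = {m.__name__: sorted(m(tmp)) for m in methods}

    timings = {}
    for method in methods:
        total = timeit.timeit(lambda: method(tmp), number=runs)
        timings[method.__name__] = total
        print(f"{method.__name__}: {total / runs * 1e6:.1f} us per call")
```

On a flat folder the three approaches return the same names, so the choice mostly comes down to readability and whether you need full paths (glob) or recursion (walk).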

Richard Nemeth