89

How can I find all files in a directory with the extension .csv in Python?

Gonçalo Peres
mintgreen
  • 4
    Looks like a duplicate of http://stackoverflow.com/questions/3964681/find-all-files-in-directory-with-extension-txt-with-python – Danny Feb 10 '12 at 20:47
  • Possible duplicate of [Find all files in a directory with extension .txt in Python](https://stackoverflow.com/questions/3964681/find-all-files-in-a-directory-with-extension-txt-in-python) – Ronak Shah Sep 12 '18 at 04:27

13 Answers

102
import os
import glob

path = 'c:\\'
extension = 'csv'
os.chdir(path)
result = glob.glob('*.{}'.format(extension))
print(result)
Community
thclpr
  • 3
    This is a short solution, but note that it only scans the current directory (where your script is running). To change that, use `os.chdir("/mydir")`, as shown here: http://stackoverflow.com/questions/3964681/find-all-files-in-directory-with-extension-txt-in-python – ppasler Dec 02 '16 at 13:00
  • 3
    @ppasler Hi, answer edited with your suggestion. Also, I think it's more Pythonic now :) – thclpr Dec 02 '16 at 15:06
  • 4
    Isn't there a way to do this without changing the directory? Can't we specify the directory as part of the glob command itself? – Nav Jan 14 '22 at 09:57
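On the question in the last comment: the directory can be made part of the glob pattern itself, so `os.chdir()` is not needed. Below is a minimal sketch of that idea; the helper name `find_csv` is hypothetical and not part of the original answer.

```python
import glob
import os


def find_csv(directory, extension="csv"):
    """Return paths of files in `directory` whose names end with .<extension>.

    `find_csv` is a hypothetical helper name used for illustration only.
    """
    # Escape the directory so characters such as '[' in its name are not
    # treated as wildcards, then append the wildcard part of the pattern.
    pattern = os.path.join(glob.escape(directory), "*.{}".format(extension))
    return glob.glob(pattern)
```

Each returned path includes the directory prefix, for example `find_csv("data")` would return entries of the form `data/<name>.csv`, and the caller's working directory is left unchanged.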
59
from os import listdir

def find_csv_filenames( path_to_dir, suffix=".csv" ):
    filenames = listdir(path_to_dir)
    return [ filename for filename in filenames if filename.endswith( suffix ) ]

The function find_csv_filenames() returns a list of filenames as strings, that reside in the directory path_to_dir with the given suffix (by default, ".csv").

Addendum

How to print the filenames:

filenames = find_csv_filenames("my/directory")
for name in filenames:
    print(name)
Bernhard Kausler
  • I'm having a problem with this code. I'm trying to display all the content in the directory using csv = csv.reader(open(filenames, 'rb')), and it gives me the error "coercing to Unicode: need string or buffer". Can you help me out, please? Thanks a lot, I'd appreciate it. – mintgreen Feb 14 '12 at 17:15
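The error in the comment above most likely comes from passing the whole list of names to open(), which expects a single path. A minimal Python 3 sketch that opens each file in turn is shown here; `read_csv_rows` is a hypothetical helper, not part of the original answer, and it assumes the files can be read with the default text encoding.

```python
import csv
import os


def read_csv_rows(path_to_dir, suffix=".csv"):
    """Yield (filename, row) pairs for every CSV file in `path_to_dir`.

    `read_csv_rows` is a hypothetical helper used for illustration only.
    """
    for filename in sorted(os.listdir(path_to_dir)):
        if not filename.endswith(suffix):
            continue
        # listdir() returns bare names, so join each one with the directory.
        full_path = os.path.join(path_to_dir, filename)
        # newline="" is what the csv module documentation recommends.
        with open(full_path, newline="") as handle:
            for row in csv.reader(handle):
                yield filename, row
```

Each row comes back as a list of strings, so `for name, row in read_csv_rows("my/directory"): print(name, row)` would print the content of every CSV file in the directory.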
33

By combining filter() with a lambda, you can easily pick out the CSV files in a given folder.

import os

all_files = os.listdir("/path-to-dir")    
csv_files = list(filter(lambda f: f.endswith('.csv'), all_files))

# The lambda returns True if a filename in `all_files` ends with .csv, otherwise False;
# filter() uses that boolean to keep only the .csv files from the list.
Carlos Azevedo
Thejesh PR
10

Use Python's os module to find CSV files in a directory.

Here is a simple example:

import os

# This is the path where you want to search.
# On Windows, 'd:' alone means the current directory on drive D, so name the root explicitly.
path = 'd:\\'

# this is the extension you want to detect
extension = '.csv'

for root, dirs_list, files_list in os.walk(path):
    for file_name in files_list:
        if os.path.splitext(file_name)[-1] == extension:
            file_name_path = os.path.join(root, file_name)
            print(file_name)
            print(file_name_path)   # This is the full path of the matching file
Rajiv Sharma
7

I had to get CSV files that were in subdirectories, so I modified the answer from thclpr to fit my use case:

import os
import glob

os.chdir( '/path/to/main/dir' )
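# Here '**.csv' is matched like '*.csv', so this pattern finds CSV files
# exactly one directory level below the main directory.
result = glob.glob( '*/**.csv' )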
print( result )
rs77
4
import os

path = 'C:/Users/Shashank/Desktop/'
os.chdir(path)

for dirpath, dirnames, filenames in os.walk(os.getcwd()):
    for filename in filenames:
        if filename.endswith('.csv'):
            print(filename)
            print(dirpath)

This also prints the directory that contains each CSV file.

Suraj Rao
4

Use the python glob module to easily list out the files we need.

import glob
path_csv = glob.glob("../data/subfolder/*.csv")
mpx
3

While the solution given by thclpr works, it scans only the files directly in the directory, not those in any subdirectories. The question does not require that, but in case someone wishes to scan subdirectories too, below is code that uses os.walk:

import os
from glob import glob
PATH = "/home/someuser/projects/someproject"
EXT = "*.csv"
all_csv_files = [file
                 for path, subdir, files in os.walk(PATH)
                 for file in glob(os.path.join(path, EXT))]
print(all_csv_files)

Copied from this blog.

Suraj
3

You could just use glob with recursive=True; the pattern ** will match any files and zero or more directories, subdirectories and symbolic links to directories.

import glob, os

os.chdir("C:\\Users\\username\\Desktop\\MAIN_DIRECTORY")

for file in glob.glob("**/*.csv", recursive=True):
    print(file)
Gonçalo Peres
2

This solution uses the Python built-in filter(), which yields the elements for which a function returns true; wrapping it in list() gives a list. In this case, the anonymous function checks whether each name in the directory listing obtained with os.listdir() ends with '.csv'.

import os

filepath = 'filepath_to_my_CSVs'  # for example: './my_data/'

# endswith() avoids matching names that merely contain '.csv', such as 'data.csv.bak'
csv_files = list(filter(lambda x: x.endswith('.csv'), os.listdir(filepath)))
Jordi Aceiton
2

Many (linked) answers change the working directory with os.chdir(). But you don't have to.

Recursively print all CSV files in /home/project/ directory:

import glob

pathname = "/home/project/**/*.csv"

for file in glob.iglob(pathname, recursive=True):
    print(file)

Requires Python 3.5+. From the docs [1]:

  • pathname can be either absolute (like /usr/src/Python-1.5/Makefile) or relative (like ../../Tools/*/*.gif)
  • pathname can contain shell-style wildcards.
  • Whether or not the results are sorted depends on the file system.
  • If recursive is true, the pattern ** will match any files and zero or more directories, subdirectories and symbolic links to directories

[1] https://docs.python.org/3/library/glob.html#glob.glob
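The documented points above can be combined into one small function. This is only a sketch: the name `find_csv_recursive` is hypothetical, and sorting the result is an added assumption about what a caller would want, since glob itself makes no promise about order.

```python
import glob
import os


def find_csv_recursive(root):
    """Return a sorted list of .csv paths under `root`, at any depth.

    `find_csv_recursive` is a hypothetical helper used for illustration only.
    """
    # Escape the root so wildcard characters in its name are taken literally.
    pattern = os.path.join(glob.escape(root), "**", "*.csv")
    # '**' only spans directory levels when recursive=True is passed.
    # sorted() gives a stable order regardless of the file system.
    return sorted(glob.glob(pattern, recursive=True))
```

Because `**` also matches zero directories, CSV files directly inside `root` are included along with those in nested subdirectories.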

Paul
2

You could just use glob with recursive = True, the pattern ** will match any files and zero or more directories, subdirectories and symbolic links to directories.

import glob, os

os.chdir("C:\\Users\\username\\Desktop\\MAIN_DIRECTORY")

for file in glob.glob("**/*.csv", recursive=True):
    print(file)
0

This function returns a list of all files with the given extension in the specified directory. Each result is prefixed with dir_path, so the paths are absolute when dir_path is absolute.

import os
from glob import glob

def get_csv_files(dir_path, ext):
    # Put the directory in the pattern instead of calling os.chdir(),
    # so the caller's working directory is left unchanged.
    return glob(os.path.join(dir_path, f'*.{ext}'))

print(get_csv_files("E:\\input\\dir\\path", "csv"))