
Suppose I have 1000 .csv files named after my employees, so there is no order or numbering in the file names. Is there a way to tell the computer, in Python, to read every file in a particular folder from first to last, no matter what its name is? (It doesn't matter whose data each file holds; I only need to grab the data for analysis.)

Hasani

3 Answers

4

You can read all csv files in a directory like this:

My csv:

col1,col2,col3
a,b,c
d,e,f

Code:

import glob
import csv

# Folder that holds the CSV files (example path)
PATH = "/Users/stack/"

# glob matches every .csv file in the folder, whatever its name is
for file in glob.glob(PATH + "*.csv"):
    with open(file, newline='') as csvfile:
        spamreader = csv.reader(csvfile, delimiter=',')
        for row in spamreader:
            print(" ".join(row))

Output:

col1 col2 col3
a b c
d e f

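A minimal alternative sketch using the standard-library pathlib module (same idea as above; the folder path is just a placeholder):

import csv
from pathlib import Path

# Placeholder folder; point this at your own directory
folder = Path("/Users/stack/")

# Path.glob matches every .csv file regardless of its name
for csv_path in folder.glob("*.csv"):
    with csv_path.open(newline='') as csvfile:
        for row in csv.reader(csvfile):
            print(" ".join(row))
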
madik_atma
2

Use code like the following (replace the current path "." with your own path):

import os, fnmatch
import csv

# List everything in the current directory ('.') and keep only the *.csv entries
listOfFiles = os.listdir('.')
pattern = "*.csv"
for entry in listOfFiles:
    if fnmatch.fnmatch(entry, pattern):
        with open(entry, newline='') as csvfile:
            spamreader = csv.reader(csvfile)
            for line in spamreader:
                print(line)
########## Using the pandas package
import os, fnmatch
import pandas as pd

listOfFiles = os.listdir('.')
pattern = "*.csv"
for entry in listOfFiles:
    if fnmatch.fnmatch(entry, pattern):
        # Read each CSV straight into a DataFrame
        read_File_as_DF = pd.read_csv(entry)
        print(read_File_as_DF)
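
If the goal is to analyse all the files together rather than print them one by one, a minimal follow-on sketch (assuming every file shares the same columns; the folder path is a placeholder) stacks them into a single DataFrame:

import glob
import pandas as pd

# Placeholder folder containing the employee CSV files
PATH = "/Users/stack/"

# Read every CSV into a DataFrame, then concatenate them into one table
frames = [pd.read_csv(f) for f in glob.glob(PATH + "*.csv")]
combined = pd.concat(frames, ignore_index=True)
print(combined.shape)
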
Pradeep Pandey
  • It seems your code reads `.txt` files, not `.csv` files. – Hasani Feb 14 '19 at 11:02
  • Change the `.txt` and it will work for CSV as well. – Pradeep Pandey Feb 14 '19 at 11:17
  • Great, it worked! Can you also write a pandas version? I think it's simpler to understand and easier for beginners in Python! – Hasani Feb 14 '19 at 11:29
  • One line of the output I get with your code looks like this: `['13971115', '1020002.00', '1020002', '1020002', '1020002.00', '1020002.00', '1021098', '130', '1']`. How can I remove the `' '` quotes? Also, how can I save the data as a matrix? (I have `numpy` for matrix work.) – Hasani Feb 14 '19 at 11:36
  • I'd like to save the read data as a matrix with rows like this: `[13971115, 1020002.00, 1020002, 1020002, 1020002.00, 1020002.00, 1021098, 130, 1]`, with no quotes. – Hasani Feb 14 '19 at 11:38
  • Using pandas: see the "Using the pandas package" block added to the answer above. – Pradeep Pandey Feb 14 '19 at 11:55
  • 1
    Use the pandas version and it will be resolved; just set the separator correctly and handle quotes, if any, in the `read_csv` function. – Pradeep Pandey Feb 14 '19 at 11:57
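
As a hedged follow-up to the matrix question in the comments: assuming a file contains only numeric values and no header row, one possible sketch loads it as a numpy array, so the quotes produced by csv.reader never appear (the file name is a placeholder):

import pandas as pd

# Placeholder file name; any of the numeric CSVs would work the same way
df = pd.read_csv("employee_data.csv", header=None)

# to_numpy() yields a numeric numpy array: numbers stay numbers, no quotes
matrix = df.to_numpy()
print(matrix)
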
1

Yes, you can. I would use a simple regex-based test: a for loop goes through the directory, and an if statement checks whether the file name ends in '.csv'. After this we open the file and append its contents to our output, which you can either analyse directly or store in a file. I've commented out the option of writing the output to a file, but you can enable it if you wish.

import os
import re

# Redefine this to the path of your folder:
folderPath = "SET UNIX PATH HERE"

output = ""
for file in os.listdir(folderPath):
    # Match file names that end in .csv
    if re.search(r'\.csv$', file):
        with open(os.path.join(folderPath, file), 'r') as readFile:
            output += readFile.read()

# Uncomment this part if you would like to store the output to a file
# Define the path to the file that will be created:
# outputFilePath = "SET UNIX PATH"
# with open(outputFilePath, 'w+') as outputFile:
#     outputFile.write(output)

Hope this helps :)

aaaakshat