
Suppose I have 1000 .csv files named after my employees, so there is no order or numbering in the file names. Is there a way to tell the computer, in Python, to read every file in a particular folder from first to last, no matter what its name is? (It doesn't matter whose data each file holds; I only need to grab the data for analysis.)

Hasani

3 Answers

4

You can read all csv files in a directory like this:

My csv:

col1,col2,col3
a,b,c
d,e,f

Code:

import glob
import csv

# Folder that holds the CSV files (example path)
PATH = "/Users/stack/"

# glob matches every .csv file in the folder, whatever its name is
for file in glob.glob(PATH + "*.csv"):
    with open(file, newline='') as csvfile:
        spamreader = csv.reader(csvfile, delimiter=',')
        for row in spamreader:
            print(" ".join(row))

Output:

col1 col2 col3
a b c
d e f

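A minimal alternative sketch using the standard-library pathlib module (same idea as above; the folder path is just a placeholder):

import csv
from pathlib import Path

# Placeholder folder; point this at your own directory
folder = Path("/Users/stack/")

# Path.glob matches every .csv file regardless of its name
for csv_path in folder.glob("*.csv"):
    with csv_path.open(newline='') as csvfile:
        for row in csv.reader(csvfile):
            print(" ".join(row))
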
madik_atma
2

Use code like the following (replace the current path "." with your own path):

import os, fnmatch
import csv

# List everything in the current directory ('.') and keep only the *.csv entries
listOfFiles = os.listdir('.')
pattern = "*.csv"
for entry in listOfFiles:
    if fnmatch.fnmatch(entry, pattern):
        with open(entry, newline='') as csvfile:
            spamreader = csv.reader(csvfile)
            for line in spamreader:
                print(line)
########## Using the pandas package
import os, fnmatch
import pandas as pd

listOfFiles = os.listdir('.')
pattern = "*.csv"
for entry in listOfFiles:
    if fnmatch.fnmatch(entry, pattern):
        # Read each CSV straight into a DataFrame
        read_File_as_DF = pd.read_csv(entry)
        print(read_File_as_DF)
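
If the goal is to analyse all the files together rather than print them one by one, a minimal follow-on sketch (assuming every file shares the same columns; the folder path is a placeholder) stacks them into a single DataFrame:

import glob
import pandas as pd

# Placeholder folder containing the employee CSV files
PATH = "/Users/stack/"

# Read every CSV into a DataFrame, then concatenate them into one table
frames = [pd.read_csv(f) for f in glob.glob(PATH + "*.csv")]
combined = pd.concat(frames, ignore_index=True)
print(combined.shape)
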
Pradeep Pandey
  • It seems your code reads `.txt` files, not `.csv` files. – Hasani Feb 14 '19 at 11:02
  • Change the `.txt` and it will work for CSV as well. – Pradeep Pandey Feb 14 '19 at 11:17
  • Great, it worked! Can you also write a pandas version? I think it's simpler to understand and easier for beginners in Python! – Hasani Feb 14 '19 at 11:29
  • One line of the output I get with your code looks like this: `['13971115', '1020002.00', '1020002', '1020002', '1020002.00', '1020002.00', '1021098', '130', '1']`. How can I remove the `' '` quotes? Also, how can I save the data as a matrix? (I have `numpy` for matrix work.) – Hasani Feb 14 '19 at 11:36
  • I'd like to save the read data as a matrix with rows like this: `[13971115, 1020002.00, 1020002, 1020002, 1020002.00, 1020002.00, 1021098, 130, 1]`, with no quotes. – Hasani Feb 14 '19 at 11:38
  • Using pandas: see the "Using the pandas package" block added to the answer above. – Pradeep Pandey Feb 14 '19 at 11:55
  • 1
    Use the pandas version and it will be resolved; just set the separator correctly and handle quotes, if any, in the `read_csv` function. – Pradeep Pandey Feb 14 '19 at 11:57
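
As a hedged follow-up to the matrix question in the comments: assuming a file contains only numeric values and no header row, one possible sketch loads it as a numpy array, so the quotes produced by csv.reader never appear (the file name is a placeholder):

import pandas as pd

# Placeholder file name; any of the numeric CSVs would work the same way
df = pd.read_csv("employee_data.csv", header=None)

# to_numpy() yields a numeric numpy array: numbers stay numbers, no quotes
matrix = df.to_numpy()
print(matrix)
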
1

Yes, you can. I would use a simple regex-based test: a for loop goes through the directory, and an if statement checks whether the file name ends in '.csv'. After this we open the file and append its contents to our output, which you can either analyse directly or store in a file. I've commented out the option of writing the output to a file, but you can enable it if you wish.

import os
import re

# Redefine this to the path of your folder:
folderPath = "SET UNIX PATH HERE"

output = ""
for file in os.listdir(folderPath):
    # Match file names that end in .csv
    if re.search(r'\.csv$', file):
        with open(os.path.join(folderPath, file), 'r') as readFile:
            output += readFile.read()

# Uncomment this part if you would like to store the output to a file
# Define the path to the file that will be created:
# outputFilePath = "SET UNIX PATH"
# with open(outputFilePath, 'w+') as outputFile:
#     outputFile.write(output)

Hope this helps :)

aaaakshat