0

I have multiple CSV files and want to count all row for each file without the title column. The result will display all file records count and total records count :

enter image description here

The following code will count all CSV file record with title columns

import glob
import pandas as pd
files = glob.glob('Folder/*.csv')
d = {f: sum(1 for line in open(f)) for f in files}
print (pd.Series(d))
print (pd.Series(d).rename('rows').rename_axis('filename').reset_index())

Questions: How to get the following results: Expect result

File1 3 Without Title column Row count

File2 3 Without Title column Row count

File3 2 Without Title column Row count

Total 8

Reference link: How to count rows in multiple csv file

Anson
  • 243
  • 1
  • 5
  • 20

2 Answers2

0

Pandas solution:

import os
import pandas as pd

path = "C:/Users/username"
files = [file for file in os.listdir(path) if file.endswith(".csv")]

output = pd.Series(name="Rows", dtype=int)
for file in files:
    df = pd.read_csv(os.path.join(path, file))
    output.at[file.replace(".csv", "")] = df.shape[0]
output.at["Total"] = output.sum()

>>> output
File1    3
File2    3
File3    2
Total    8
Name: Rows, dtype: int64

CSV solution:

import os
import csv

path = "C:/Users/username"
files = [file for file in os.listdir(path) if file.endswith(".csv")]

output = dict()
for file in files:
    with open(os.path.join(path, file)) as f:
        reader = csv.reader(f)
        output[file.replace(".csv", "")] = sum(1 for row in reader)-1
output["Total"] = sum(output.values())

>>> output
{'File1': 3, 'File2': 3, 'File3': 2, 'Total': 8}
not_speshal
  • 22,093
  • 2
  • 15
  • 30
  • Hi not_speshal. Do you add total for all file rows count? I get error message when run your code: ":7: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning. output = pd.Series(name="Rows")" – Anson Aug 25 '21 at 17:06
  • @Anson - Edited my answer to include the "Total" and silence the warning. – not_speshal Aug 25 '21 at 17:10
  • Hi Not_speshal, Your solution is good. Could you add print (output) in your code? I will accept your solution – Anson Aug 25 '21 at 17:46
  • Hi not_speshal, Can I export the result to excel file? Thanks – Anson Aug 25 '21 at 19:28
  • For the pandas Series, you can use: `output.to_csv("output.csv")` – not_speshal Aug 25 '21 at 19:32
0

You can try :

import glob

files = glob.glob('Folder/*.csv')
d = {}
total = 0
for f in files:
    with open(f, 'r') as out:
        nb_row = len(out.readlines())
        d[f.replace(".csv", "")]= nb_row
        total += nb_row
print(d)
print("Total", total)
Clément Perroud
  • 483
  • 3
  • 12
  • Hi Clément Perroud. I get this error"UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 3823: character maps to . We may need to change encoding? " – Anson Aug 25 '21 at 19:56