Deleting rows from several CSV files using Python

Question

I want to delete specific rows (rows 0 to 33) from every CSV file in my directory, but there are 224 separate CSV files to process. How can I do this for all of them with a single script?

You have to read all rows from the file into memory, remove the selected rows, and write the rest back to the file. If you create a function that does this for one filename, you can use `os.listdir()` to get the names of all files in the directory and call your function with every filename. — furas, Jul 07 '19 at 23:55
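
The approach in the comment above could be sketched roughly as follows. This is only an illustration, not the commenter's code: the function names `drop_first_rows` and `drop_first_rows_in_dir` are made up here, and it assumes that every physical line is one row (no quoted field contains an embedded newline) and that overwriting the files in place is acceptable.

```python
import os


def drop_first_rows(path, n_rows=34):
    """Remove the first n_rows lines of the file at path, in place."""
    # newline="" keeps the original line endings untouched
    with open(path, "r", newline="") as f:
        lines = f.readlines()          # read the whole file into memory
    with open(path, "w", newline="") as f:
        f.writelines(lines[n_rows:])   # write back only the lines after the removed ones


def drop_first_rows_in_dir(directory, n_rows=34):
    """Apply drop_first_rows to every .csv file directly inside directory."""
    for name in sorted(os.listdir(directory)):
        if name.lower().endswith(".csv"):
            drop_first_rows(os.path.join(directory, name), n_rows)
```

Calling `drop_first_rows_in_dir("yourdir")` would then process every CSV file in that folder. Since the originals are overwritten, it is probably wise to try it on a copy of the directory first.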

score 2 · Answer 1 · answered Jul 08 '19 at 00:11

I think you can do this quite easily with glob and pandas. I'm not sure whether you want to overwrite your original files, which is something I never recommend, so be careful: this code will do that.

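import os
import glob
import pandas as pd

os.chdir(r'yourdir')
allFiles = glob.glob("*.csv")  # match your csvs
for file in allFiles:
    df = pd.read_csv(file)  # the first line of each file is read as the header
    df = df.iloc[34:]  # drop data rows 0-33, keep row 34 onwards
    df.to_csv(file, index=False)  # overwrites the original; index=False avoids adding an index column
    print(f"Removed rows 0-33 from {file}")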

Or something along those lines.

score 0 · Answer 2 · answered Jul 08 '19 at 00:02

This is a simple combination of two separate tasks.

First, you need to loop through all the csv files in a folder. See this StackOverflow answer for how to do that.

Next, within that loop, for each file, you need to modify the csv by removing rows. See this answer for how to read a csv, write a csv, and omit certain rows based on a condition.

One final aspect is that you want to omit certain line numbers. A good way to do this is with the enumerate function.

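So code such as this will skip rows by their line number:

import csv

# newline='' lets the csv module handle line endings itself
with open('first.csv', 'r', newline='') as infile, \
     open('first_edit.csv', 'w', newline='') as outfile:
    reader = csv.reader(infile)  # parse each line into a list of fields
    writer = csv.writer(outfile)
    for i, row in enumerate(reader):
        if i > 33:  # skip rows 0-33
            writer.writerow(row)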
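
Putting the two steps above together (looping over the files and skipping rows by number) might look something like the sketch below. The function name `strip_leading_rows` and its parameters are hypothetical, and it assumes the output directory is different from the source directory so that no file is read and written at the same time.

```python
import csv
from itertools import islice
from pathlib import Path


def strip_leading_rows(source_dir, output_dir, n_rows=34):
    """Copy each CSV from source_dir to output_dir without its first n_rows records."""
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)  # create the output directory if needed
    for src in sorted(source_dir.glob("*.csv")):
        with open(src, newline="") as fin, \
             open(output_dir / src.name, "w", newline="") as fout:
            reader = csv.reader(fin)
            writer = csv.writer(fout)
            # islice skips records 0..n_rows-1 and yields the rest lazily
            writer.writerows(islice(reader, n_rows, None))
```

Because the rows are streamed through `islice`, each file is never held in memory as a whole, and the originals stay untouched.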

mohd4482 · Answer 3 · 2019-07-08T00:25:52.593

Iterate over the CSV files, use pandas to remove the top 34 rows of each file, then save the result to an output directory.

Try this code after installing pandas:

from pathlib import Path
import pandas as pd

source_dir = Path('path/to/source/directory')
output_dir = Path('path/to/output/directory')
output_dir.mkdir(parents=True, exist_ok=True)  # create the output directory if it is missing

for file in source_dir.glob('*.csv'):
    df = pd.read_csv(file)
    df.drop(df.head(34).index, inplace=True)
    df.to_csv(output_dir.joinpath(file.name), index=False)
