How do I use Python to iterate through a directory and delete specific columns from all CSVs?

Question

I have a directory with several csvs.

files = glob('C:/Users/jj/Desktop/Bulk_Wav/*.csv')

Each CSV has the same columns, shown in the reprex below:

yes no maybe ofcourse
1   2  3     4

I want my script to iterate through all CSVs in the folder and delete the `maybe` and `ofcourse` columns.

You can try pandas using `df.drop(['maybe', 'ofcourse'], axis=1)`, check here https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html — Epsi95, Jul 16 '21 at 04:06
Does this answer your question? [How can I iterate over files in a given directory?](https://stackoverflow.com/questions/10377998/how-can-i-iterate-over-files-in-a-given-directory) or [Iterating-through-directories-with-python](https://stackoverflow.com/questions/19587118/iterating-through-directories-with-python) or [loop-through-all-csv-files-in-a-folder](https://stackoverflow.com/questions/14262405/loop-through-all-csv-files-in-a-folder) or [import-multiple-csv-files](https://stackoverflow.com/questions/20906474/import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe) — Anurag Dabas, Jul 16 '21 at 04:07
Are they literally CSV files (that is, COMMA separated), or do they look like the text above? For CSVs, you don't need Python for this. — Tim Roberts, Jul 16 '21 at 04:07

score 1 · Accepted Answer · answered Jul 16 '21 at 04:14

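If glob provides you with the file paths, you can do the following with pandas:

from glob import glob

import pandas as pd

files = glob('C:/Users/jj/Desktop/Bulk_Wav/*.csv')
drop = ['maybe', 'ofcourse']

for file in files:
    df = pd.read_csv(file)
    # drop each listed column only if this file actually has it
    for col in drop:
        if col in df:
            df = df.drop(col, axis=1)
    # index=False keeps the row index out of the saved file
    df.to_csv(file, index=False)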

Alternatively, for a cleaner way to avoid KeyErrors from drop, filter the list of columns first:

from glob import glob

import pandas as pd

files = glob('C:/Users/jj/Desktop/Bulk_Wav/*.csv')
drop = ['maybe', 'ofcourse']

for file in files:
    df = pd.read_csv(file)
    # keep only the names that exist in this file, then drop them in one call
    df = df.drop([c for c in drop if c in df], axis=1)
    df.to_csv(file, index=False)
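
As a further variant, `DataFrame.drop` accepts `errors='ignore'`, which skips labels that are not present, so no manual filtering is needed. The following is a minimal sketch, assuming the column names carry no trailing spaces; the helper name `drop_columns_in_dir` is made up for illustration:

```python
from pathlib import Path

import pandas as pd


def drop_columns_in_dir(directory, columns):
    """Remove `columns` from every .csv file in `directory`, overwriting each file."""
    # sorted() collects the paths up front, before any file is rewritten
    for path in sorted(Path(directory).glob("*.csv")):
        df = pd.read_csv(path)
        # errors="ignore" skips any listed column that this file lacks
        df = df.drop(columns=columns, errors="ignore")
        # index=False avoids writing the row index as an extra column
        df.to_csv(path, index=False)
```

With the folder from the question, this would be called as `drop_columns_in_dir('C:/Users/jj/Desktop/Bulk_Wav', ['maybe', 'ofcourse'])`. Since the files are overwritten, it may be worth trying it on a copy of the folder first.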

score 0 · Answer 2 · answered Jul 16 '21 at 04:10

Do you mean something like this?

from glob import glob

import pandas as pd

files = glob('C:/Users/jj/Desktop/Bulk_Wav/*.csv')
for filename in files:
    df = pd.read_csv(filename)
    df = df.drop(['maybe', 'ofcourse'], axis=1)
    df.to_csv(filename, index=False)

This code removes the maybe and ofcourse columns from each file and saves the result back to the same CSV.
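
One detail to watch when saving back over the same file: `to_csv` writes the row index by default, so the rewritten file gains an unnamed leading column unless `index=False` is passed. A small sketch with made-up in-memory data shows the difference:

```python
import io

import pandas as pd

df = pd.DataFrame({"yes": [1], "no": [2]})

with_index = io.StringIO()
df.to_csv(with_index)                  # default: index=True
without_index = io.StringIO()
df.to_csv(without_index, index=False)

print(with_index.getvalue().splitlines()[0])     # ,yes,no
print(without_index.getvalue().splitlines()[0])  # yes,no

# Reading the first version back shows the extra column
reread = pd.read_csv(io.StringIO(with_index.getvalue()))
print(list(reread.columns))  # ['Unnamed: 0', 'yes', 'no']
```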

score 0 · Answer 3 · answered Jul 16 '21 at 04:18


You can use pandas to read the CSV file into a DataFrame, then use drop() to remove specific columns. Something like this:

import pandas as pd

df = pd.read_csv(csv_filename)
df = df.drop(['maybe', 'ofcourse'], axis=1)  # drop() returns a new DataFrame
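
Note that `drop()` returns a new DataFrame and leaves the original unchanged unless the result is assigned back (or `inplace=True` is passed). A short sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame({"yes": [1], "no": [2], "maybe": [3], "ofcourse": [4]})

df.drop(["maybe", "ofcourse"], axis=1)       # result discarded, df unchanged
print(list(df.columns))  # ['yes', 'no', 'maybe', 'ofcourse']

df = df.drop(["maybe", "ofcourse"], axis=1)  # result assigned back
print(list(df.columns))  # ['yes', 'no']
```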

Shuduo

score 0 · Answer 4 · answered Jul 16 '21 at 04:22

import pandas as pd
from glob import glob

files = glob(r'C:/Users/jj/Desktop/Bulk_Wav/*.csv')
for filename in files:
    df = pd.read_csv(filename, sep='\t')
    df.drop(['maybe', 'ofcourse'], axis=1, inplace=True)
    df.to_csv(filename, sep='\t', index=False)

If the files look exactly like the sample in the question (here assumed to be tab-separated rather than comma-separated), then something like the code above may work.
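
If the columns are instead separated by runs of spaces, as the sample in the question appears to be, a regular-expression separator is one option. This is a sketch under that assumption, using an in-memory copy of the sample rather than a real file:

```python
import io

import pandas as pd

# The reprex from the question, with columns separated by varying spaces
sample = "yes no maybe ofcourse\n1   2  3     4\n"

# sep=r"\s+" treats any run of whitespace as a single delimiter
df = pd.read_csv(io.StringIO(sample), sep=r"\s+")
df = df.drop(columns=["maybe", "ofcourse"])

# Write back with a single space between fields and no index column
print(df.to_csv(index=False, sep=" "))
```

For real files, the `io.StringIO(sample)` argument would be replaced by the file path, and `to_csv` would be given the same path to overwrite it.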
