
I am trying to filter 800 CSV files, but the script writes no entries to the output on the first run. It only filters the data after I have run it 3 to 4 times, never the first time, and it does not raise any error.

import pandas as pd 
import os
from datetime import datetime


def filter_data(filenames, search_keyword):
    df_resultant = pd.DataFrame([])
    for filename in filenames:
        col_names = ["col1", "col2", "col3", "col4", "col5"]
        # excel_file_df = pd.read_csv(filename, index_col=False, nrows=0)
        # excel_file_df = pd.read_csv(filename, low_memory=False, names=col_names, dtype={'col1': str,'col4': str})
        # excel_file_df = pd.read_csv(
        #     filename, low_memory=False, names=col_names)
        excel_file_df = pd.read_csv(
            filename, dtype='unicode', names=col_names)

        df = excel_file_df[excel_file_df['col3'].str.contains(
            search_keyword, na=False, case=False)]

        if not df.empty:
            df_copy = df.copy()
            df_copy.loc[:, 'file_name'] = os.path.basename(filename)
            df_copy.columns = [''] * len(df_copy.columns)
            df_resultant = pd.concat([df_resultant, df_copy])

    df_resultant.to_csv(
        './output/' + str(datetime.now().timestamp())+'.csv', index=False)
  • What is the error? Please include it in your post – Yannis P. May 30 '22 at 19:12
  • [Never call `DataFrame.append` or `pd.concat` inside a for-loop. It leads to quadratic copying.](https://stackoverflow.com/a/36489724/1422451) – Parfait May 30 '22 at 20:05
  • Thanks for your help, sir. I am neither receiving any error nor getting any data in the output .csv file. It starts working when I run it 3 to 4 times, or sometimes 5 to 6 times or more, but it doesn't work on the first run. – Saim Aqeel May 31 '22 at 07:44
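
Parfait's comment about not calling `pd.concat` inside the loop could look roughly like the sketch below. It assumes the same five column names as the question, collects the matching rows from each file in a list, and concatenates once at the end. The name `COL_NAMES` and the choice to return the frame instead of writing it to `./output/` are illustrative assumptions, not part of the original script. This only addresses the quadratic copying and is not claimed to explain the empty output on the first run.

```python
import os

import pandas as pd

# Assumed column layout, copied from the question's col_names list.
COL_NAMES = ["col1", "col2", "col3", "col4", "col5"]


def filter_data(filenames, search_keyword):
    """Return rows whose col3 contains search_keyword, tagged with the source file name."""
    matches = []  # per-file results are collected here, not concatenated in the loop
    for filename in filenames:
        frame = pd.read_csv(filename, dtype=str, names=COL_NAMES)
        mask = frame["col3"].str.contains(search_keyword, na=False, case=False)
        hits = frame[mask].copy()
        if not hits.empty:
            hits["file_name"] = os.path.basename(filename)
            matches.append(hits)

    if not matches:
        # No file matched: return an empty frame with the expected columns.
        return pd.DataFrame(columns=COL_NAMES + ["file_name"])

    # A single concat at the end avoids re-copying the accumulated rows per file.
    return pd.concat(matches, ignore_index=True)
```

The caller can then check `result.empty` before writing the CSV, which makes a run that matched nothing easy to tell apart from a run that failed to write.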
