
We are appending data to a BigQuery table from all the CSV files available on Google Drive. Below is the code, which works fine for a single file (trainers.csv).

I need help running all the files in a single go. How can I read all the available CSV files from Google Drive, save them to a pandas DataFrame, and run my complete process in a loop?

import pandas as pd
from google.colab import drive

drive.mount('/content/drive')

my_data = pd.read_csv('/content/drive/MyDrive/Vestiaire_data/july-2022/trainers.csv',encoding = 'ISO-8859-1',low_memory=False)

my_data.to_gbq(-------------)
pythonlearner
  • You can get all the file names in a list and iterate over them in a loop. Please refer to this https://stackoverflow.com/questions/20906474/import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe – jen Aug 07 '22 at 13:55
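For reference, a minimal sketch of the approach from the linked answer, assuming the files live in the folder used in the question (the glob pattern and path are placeholders):

import glob
import pandas as pd

# Collect every CSV path in the Drive folder (folder path taken from the question)
csv_paths = glob.glob('/content/drive/MyDrive/Vestiaire_data/july-2022/*.csv')

# Read each file and concatenate everything into one DataFrame
frames = [pd.read_csv(p, encoding='ISO-8859-1', low_memory=False) for p in csv_paths]
combined = pd.concat(frames, ignore_index=True)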

1 Answer


It is easy. Just get all the file names from the directory using the os module, iterate over the CSV files, and append them all into one DataFrame like this:

import pandas as pd
import os

directory_path = '/content/drive/MyDrive/Vestiaire_data/july-2022'
directory_files = os.listdir(directory_path)

df = pd.DataFrame()
for file in directory_files:
    # Skip anything in the folder that is not a CSV file
    if not file.endswith('.csv'):
        continue
    df_file = pd.read_csv(os.path.join(directory_path, file), encoding='ISO-8859-1', low_memory=False)
    df = pd.concat([df, df_file], ignore_index=True)

# Then save the df here

Just make sure that all the CSVs have the same structure.
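To finish the process from the question, a minimal sketch of the upload step, assuming pandas-gbq is installed and you are already authenticated in Colab; the dataset, table, and project names below are placeholders, not real values:

# Append the combined DataFrame to BigQuery
df.to_gbq('my_dataset.my_table',        # hypothetical destination table
          project_id='my-gcp-project',  # hypothetical GCP project id
          if_exists='append')           # append rows instead of replacing the table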

SenthurLP