I am using the following code to read all CSV files in a specific folder into a single DataFrame.

import glob
import os

import pandas as pd

path = r'C:\myfiles\CSV'

# Collect every CSV file in the folder
all_files = glob.glob(os.path.join(path, "*.csv"))

# Read each file lazily, then stack them into one DataFrame
df_from_each_file = (pd.read_csv(f) for f in all_files)
concatenated_df = pd.concat(df_from_each_file, ignore_index=True)

The code above reads all the CSV files into one DataFrame. However, I need to set the dtype of certain columns to object on read, because the leading zeros in a number of columns are being dropped. For example, the product code column has to be read as object.

Below is a snippet of one of the CSV files in question; there are over 20 files in total.

Time Period Product Number
2018_Q1     000123
2018_Q1     000567
2018_Q1     000345
2018_Q1     000853
2018_Q1     000147
2018_Q1     000963
2018_Q1     000852
2018_Q1     000120
2018_Q1     000100
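
For reference, here is a minimal sketch of one way this might be handled: `pd.read_csv` accepts a `dtype` argument that maps column names to types, so the affected columns can be read as strings and the leading zeros kept. The helper name `read_csv_folder`, the comma-separated layout, and the exact header `Product Number` are assumptions made for illustration; the demo writes two throwaway files rather than using the real folder.

```python
import glob
import os
import tempfile

import pandas as pd


def read_csv_folder(path, dtype_map):
    """Read every *.csv in `path` into one DataFrame.

    `dtype_map` (hypothetical name) maps column names to dtypes and is
    passed straight to pd.read_csv, so those columns are not parsed as numbers.
    """
    all_files = sorted(glob.glob(os.path.join(path, "*.csv")))
    frames = (pd.read_csv(f, dtype=dtype_map) for f in all_files)
    return pd.concat(frames, ignore_index=True)


# Demo on temporary files; the layout below is an assumed comma-separated
# version of the sample shown in the question.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "a.csv": ["2018_Q1,000123", "2018_Q1,000567"],
        "b.csv": ["2018_Q1,000100"],
    }
    for name, rows in samples.items():
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write("Time Period,Product Number\n" + "\n".join(rows) + "\n")

    demo_df = read_csv_folder(tmp, {"Product Number": str})

print(demo_df["Product Number"].tolist())
```

With the real data this would be called as `read_csv_folder(r'C:\myfiles\CSV', {"Product Number": str})`, listing each column that needs its zeros preserved. Passing `dtype=str` on its own, without a mapping, would read every column as text instead.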

Any help that anyone could provide would be greatly appreciated.

moe_95
