I am using the following code to read all CSV files in a specific folder into a single DataFrame:
import pandas as pd
import glob, os
path = r'C:\myfiles\CSV'
all_files = glob.glob(os.path.join(path, "*.csv"))
df_from_each_file = (pd.read_csv(f) for f in all_files)
concatenated_df = pd.concat(df_from_each_file, ignore_index=True)
The above code reads all of the CSV files into one DataFrame. However, I need to set the dtype of certain columns to object
when the files are read, because the leading zeros in a number of columns are being dropped. For example, the product code
column (Product Number in the sample data) has to be read as object.
Below is a snippet of one of the CSV files in question; there are over 20 files in total.
Time Period    Product Number
2018_Q1        000123
2018_Q1        000567
2018_Q1        000345
2018_Q1        000853
2018_Q1        000147
2018_Q1        000963
2018_Q1        000852
2018_Q1        000120
2018_Q1        000100
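For reference, here is a minimal sketch of one possible approach: passing the dtype parameter of pd.read_csv for each file so that the column is kept as text. It assumes the files are comma-delimited and that the header is exactly "Product Number". The read_csv_folder helper and the temporary demo files are hypothetical and exist only to make the example self-contained.

```python
import glob
import os
import tempfile

import pandas as pd


def read_csv_folder(path, dtype=None):
    """Read every *.csv file in `path` and concatenate them into one DataFrame.

    `dtype` is forwarded unchanged to pd.read_csv for each file.
    (Hypothetical helper, not part of pandas.)
    """
    all_files = sorted(glob.glob(os.path.join(path, "*.csv")))
    frames = (pd.read_csv(f, dtype=dtype) for f in all_files)
    return pd.concat(frames, ignore_index=True)


# Demo on throwaway files (assumed comma-delimited, assumed header names).
with tempfile.TemporaryDirectory() as tmp:
    samples = [("a.csv", ["000123", "000567"]), ("b.csv", ["000345"])]
    for name, codes in samples:
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write("Time Period,Product Number\n")
            for code in codes:
                fh.write(f"2018_Q1,{code}\n")

    # Read "Product Number" as text so the leading zeros are not parsed away.
    concatenated_df = read_csv_folder(tmp, dtype={"Product Number": str})

print(concatenated_df)
```

With the real data, the call would presumably be read_csv_folder(path, dtype={"Product Number": str}). Passing dtype=str instead would read every column as text, which may be simpler if several columns have leading zeros.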
Any help that anyone could provide would be greatly appreciated.