Read all CSV files into a DataFrame - column dtype

Asked Sep 24 '19 at 08:29

Active Sep 24 '19 at 08:33

Viewed 28 times

I am using the following code to read all CSV files in a specific folder into a single DataFrame.

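import glob
import os

import pandas as pd

# Folder containing the CSV files
path = r'C:\myfiles\CSV'

# Collect the path of every CSV file in the folder
all_files = glob.glob(os.path.join(path, "*.csv"))

# Read each file lazily, then stack the results into one DataFrame
df_from_each_file = (pd.read_csv(f) for f in all_files)
concatenated_df = pd.concat(df_from_each_file, ignore_index=True)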

The above code reads all the CSV files into one DataFrame. However, I need to set the dtype of certain columns to object when they are read, because the leading zeros in a number of columns are being dropped. For example, the Product Number column has to be read as object.

Below is a snippet of one of the CSV files in question; there are over 20 CSV files in total.

Time Period Product Number
2018_Q1     000123
2018_Q1     000567
2018_Q1     000345
2018_Q1     000853
2018_Q1     000147
2018_Q1     000963
2018_Q1     000852
2018_Q1     000120
2018_Q1     000100

Any help that anyone could provide would be greatly appreciated.

moe_95


Change `pd.read_csv(f)` to `pd.read_csv(f, dtype={'Product Number':str})` – jezrael Sep 24 '19 at 08:32

Worked perfectly, you're a legend. – moe_95 Sep 24 '19 at 08:34
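
For anyone landing here later, the fix from the comment can be sketched end to end as below. This is a minimal illustration, not the asker's actual setup: the `csv_texts` strings and the `read_all` helper are hypothetical stand-ins for the files on disk, and the data is assumed to be comma-separated. It compares the default type inference with passing `dtype` for the `Product Number` column.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for the CSV files in the folder
# (assumed to be comma-separated, with the headers from the question).
csv_texts = [
    "Time Period,Product Number\n2018_Q1,000123\n2018_Q1,000567\n",
    "Time Period,Product Number\n2018_Q2,000345\n2018_Q2,000853\n",
]


def read_all(texts, **read_csv_kwargs):
    """Read every CSV text and concatenate the results into one DataFrame."""
    frames = (pd.read_csv(io.StringIO(t), **read_csv_kwargs) for t in texts)
    return pd.concat(frames, ignore_index=True)


# Default behaviour: the column is inferred as integers, so the zeros are lost.
inferred = read_all(csv_texts)

# With dtype given per column, the values are kept as text exactly as written.
as_text = read_all(csv_texts, dtype={"Product Number": str})

print(inferred["Product Number"].tolist())  # [123, 567, 345, 853]
print(as_text["Product Number"].tolist())   # ['000123', '000567', '000345', '000853']
```

With real files, the same keyword goes straight into the generator from the question: `pd.read_csv(f, dtype={'Product Number': str})`. If several columns carry leading zeros, the dict can list each of them, or `dtype=str` can be passed to read every column as text.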
