Here is my question.
I have a bunch of .csv files (or other files). Pandas is an easy way to read them and save them in DataFrame
format, but when the number of files is huge, I want to read them with multiprocessing to save some time.
My early attempt
I manually divided the files into different directories (e.g. ./task_1) and ran the following separately for each one:
import os
import pandas as pd

os.chdir("./task_1")
files = os.listdir('.')
files.sort()
for file in files:
    filename, extname = os.path.splitext(file)
    if extname == '.csv':
        f = pd.read_csv(file)
        # .as_matrix() is deprecated; .to_numpy() returns the same array
        df = f.VALUE.to_numpy().reshape(75, 90)
Then I combine the results by hand.
How can I run this with a multiprocessing Pool
to achieve the same result without splitting the files manually?
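To make the question concrete, here is a minimal sketch of the direction I have in mind (the glob pattern, the load_csv helper name, and the 75x90 reshape are just placeholders based on my code above); is something like this the right approach?

import glob
import multiprocessing as mp

import numpy as np
import pandas as pd

def load_csv(path):
    # read one CSV and reshape its VALUE column (same logic as above)
    f = pd.read_csv(path)
    return f.VALUE.to_numpy().reshape(75, 90)

if __name__ == '__main__':
    files = sorted(glob.glob('./task_1/*.csv'))
    with mp.Pool() as pool:
        # one array per file, returned in input order
        arrays = pool.map(load_csv, files)
    # combine into a single 3-D array (files x 75 x 90)
    combined = np.stack(arrays)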
Any advice would be appreciated!