I am not familiar with Python, but I would like to run a function that reads and writes multiple files in parallel. Here is a minimal example:
from multiprocessing import Pool

import pandas as pd


def multiple(input_path, output_path, n):
    # Read a CSV, multiply every value by n, and write the result out.
    df = pd.read_csv(input_path, index_col=0)
    new_df = df.multiply(n)
    new_df.to_csv(output_path)


workers = 6
input_filenames = [f'input_{i}.csv' for i in range(1, 11)]
output_filenames = [f'output_{i}.csv' for i in range(1, 11)]

with Pool(workers) as pool:
    pool.map(multiple, ...)
If I were using a for loop, I could do it like this:
for i, input_path in enumerate(input_filenames):
    output_path = output_filenames[i]
    multiple(input_path, output_path, 2)
How should I convert this into pool.map so that each input filename stays matched by index with its output filename, and so that all three arguments (input_path, output_path, n) are passed to the function?
Thank you!