I have a simple multiprocessing task that writes to a CSV file. The program reads around 40k rows from one file, processes the data, and writes the results to another file. My code looks like this:
create_queue_infile(csv_file, q, opt)

pool = multiprocessing.Pool(processes=(multiprocessing.cpu_count() - 1))
while not q.empty():
    res = pool.apply_async(my_function, args=(q.get(), input2, 5, output,))
pool.close()
pool.join()
And the part to write to another csv file looks like this:
def write_to_csv(path, csv_row):
    with open(path, 'a', newline='', encoding="utf-8") as f:
        f.write(csv_row)
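One way I could test the overlap theory would be to serialize the appends with a shared multiprocessing.Lock. This is only a sketch, assuming the lock is handed to each worker through the Pool's initializer (the names init_worker and write_to_csv_locked are hypothetical):

```python
import multiprocessing

lock = None  # set in each worker by init_worker

def init_worker(shared_lock):
    # Runs once per worker process; a Lock cannot be passed as an
    # apply_async argument, so it is shared via the initializer:
    #   multiprocessing.Pool(initializer=init_worker, initargs=(shared_lock,))
    global lock
    lock = shared_lock

def write_to_csv_locked(path, csv_row):
    # Only one process at a time may open and append to the file,
    # so rows from different workers can no longer interleave.
    with lock:
        with open(path, 'a', newline='', encoding="utf-8") as f:
            f.write(csv_row)
```

If the corruption disappears with the lock in place, that would confirm that concurrent appends are the cause.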
This works flawlessly on my Linux machine at any scale. However, when I run the program on a Windows machine, the output file contains corrupted lines. It looks as if the processes overlap while writing to the same file. A sample output looks like this:
ROW SOME_INFORMATION
ROW SOME_INFORMATION
ROW SOME_INFORMATION
SOM
ROW SOME_INFORMATION
ROW SOME_INFORMATION
ROW SOME_INFORMATION
FORMATION
ROW SOME_INFORMATION
ROW SOME_INFORMATION
Since I have a lot of data, it is impractical to keep track of every row, so I am trying to figure out the reason behind this problem. What really puzzles me is why this works on Linux but not on Windows.
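For reference, one restructuring I could fall back on is a single-writer pattern, where the workers only compute and the parent process performs every write, so rows can never interleave regardless of platform. This is a sketch; process_row stands in for my_function, and the real arguments (input2, output, etc.) are omitted:

```python
import multiprocessing

def process_row(row):
    # Placeholder for my_function's per-row processing (assumption):
    # return the finished CSV line, rather than writing it here.
    return row.upper() + "\n"

def run(rows, path):
    # Workers compute in parallel; only the parent appends to the
    # file, so no two writes can ever overlap.
    with multiprocessing.Pool() as pool, \
         open(path, "a", newline="", encoding="utf-8") as f:
        for csv_row in pool.imap_unordered(process_row, rows):
            f.write(csv_row)
```

The trade-off is that row order in the output follows completion order, not input order (imap instead of imap_unordered would preserve it), but every line stays intact.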