I am trying to load a DataFrame from a large CSV file, as shown below. Currently, this line fails with an out-of-memory error. I would like to use the multiprocessing package (from multiprocessing.pool import ThreadPool) when loading this CSV file. Here is the line I am trying to run with multiprocessing:
source_data_df = pd.read_csv(temp_file, skipinitialspace=True, dtype=str, na_values=['N.A.'])
Could someone show how this line, plus any additional code it needs, would look when run with multiprocessing?
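For reference, here is a rough sketch of the kind of thing being asked about, not a confirmed solution. It assumes that reading the file in pieces via the `chunksize` argument of `pd.read_csv` is acceptable, and it hands each piece to a `ThreadPool` worker. The names `load_csv_in_chunks` and `process_chunk`, the chunk size, and the worker count are all hypothetical placeholders. Note that threads on their own do not lower memory use: if every chunk is kept unchanged and concatenated at the end, the full data still has to fit in memory, so any saving would have to come from `process_chunk` shrinking each chunk (filtering rows, dropping columns, aggregating).

```python
from itertools import islice
from multiprocessing.pool import ThreadPool

import pandas as pd


def process_chunk(chunk):
    # Hypothetical per-chunk step: drop rows where every column is missing.
    # Replace this with whatever filtering or aggregation shrinks the data.
    return chunk.dropna(how="all")


def load_csv_in_chunks(path, chunksize=100_000, workers=4):
    # With chunksize set, read_csv returns an iterator of DataFrames
    # instead of loading the whole file at once.
    reader = pd.read_csv(
        path,
        skipinitialspace=True,
        dtype=str,
        na_values=["N.A."],
        chunksize=chunksize,
    )
    parts = []
    with ThreadPool(workers) as pool:
        while True:
            # Pull at most `workers` chunks at a time so that only a
            # bounded number of raw chunks is held in memory together.
            batch = list(islice(reader, workers))
            if not batch:
                break
            # pool.map keeps the results in the same order as the input.
            parts.extend(pool.map(process_chunk, batch))
    if not parts:
        return pd.DataFrame()
    return pd.concat(parts, ignore_index=True)
```

It would be called as `source_data_df = load_csv_in_chunks(temp_file)`, with `temp_file` being the same path as in the line above.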
Big thank you!
Michael