
I am trying to load a file from https://data.medicare.gov/Physician-Compare/Physician-Compare-National-Downloadable-File/mj5m-pzi6 into a pandas DataFrame. My computer has 16 GB of RAM, while this file is less than 800 MB as a CSV.

I am able to load 1 million of the 2.6 million rows, but I cannot load the whole CSV. When I try to load the whole CSV, I get a MemoryError. However, when I look at memory usage in Task Manager, it doesn't go over 34% (which means I should have about 10 GB available), yet I still get the error.
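One diagnostic worth running (this is an assumption about the cause, not something confirmed here): a 32-bit Python interpreter is capped at roughly 2 GB of addressable memory regardless of how much RAM is installed, which would produce a MemoryError while Task Manager still shows plenty of free memory. A quick check:

```python
import struct
import sys

# A 32-bit interpreter can only address ~2 GB, no matter the installed RAM.
# calcsize("P") is the size of a pointer in bytes: 4 -> 32-bit, 8 -> 64-bit.
bits = struct.calcsize("P") * 8
print(f"Running {bits}-bit Python: {sys.version}")
```

If this prints 32, installing a 64-bit Python build may resolve the error without any code changes.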

I am able to run the code below when I add nrows=1000000:

import pandas as pd

df = pd.read_csv('Physician_Compare_National_Downloadable_File.csv',
                 encoding='utf-8', engine='python')

I know pandas can need roughly 10x the RAM of the on-disk file size, so I don't understand how 16 GB is not enough. Is there something I need to do to optimize my PC or Python to use the whole 16 GB?
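One standard way to keep peak memory down, sketched below with a hypothetical in-memory CSV standing in for the real file (for the actual download you would pass the filename instead of the StringIO buffer): read the file in chunks via the chunksize parameter, optionally narrowing column dtypes, and concatenate at the end. The python engine is also generally slower and hungrier than the default C engine, so dropping engine='python' may help on its own.

```python
import io
import pandas as pd

# Hypothetical tiny CSV standing in for the real 800 MB file.
csv_data = io.StringIO("npi,name,state\n1,foo,NY\n2,bar,CA\n3,baz,TX\n")

# Read in fixed-size chunks so pandas never holds the parser's
# intermediate copies of every row at once; downcast wide integer
# columns to shrink the final frame.
chunks = pd.read_csv(csv_data, chunksize=2, dtype={"npi": "int32"})
df = pd.concat(chunks, ignore_index=True)
print(len(df), df["npi"].dtype)
```

Note that pd.concat still materializes the full DataFrame at the end, so the biggest wins come from the dtype narrowing (and from usecols to drop columns you don't need) rather than from chunking alone.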

CandleWax
