I used gensim and Word Mover's Distance to generate a distance matrix for my ~4000 texts. The first run took about a day. I am now running it again, and it has taken more than 4 days so far. Another function that runs KMedoids usually takes minutes but now takes days. I have tried using the garbage collector to free up memory, but that hasn't solved anything.

import gc
gc.collect()

I've tried restarting my computer but that also didn't help. I'm not sure what to do to fix this.
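
For scale, a symmetric distance matrix over n texts needs n(n-1)/2 distance evaluations, which is about 8 million if ~4000 is the number of texts. The sketch below times a small sample and extrapolates to the full matrix, to check whether "about a day" or "more than 4 days" is the expected cost. `toy_distance`, `pair_count`, and `estimate_full_runtime` are hypothetical names, and the Jaccard distance is only a stand-in so the sketch runs without gensim; the real Word Mover's Distance call would be passed in its place.

```python
import itertools
import time


def toy_distance(a, b):
    # Hypothetical stand-in for the real distance (e.g. Word Mover's Distance).
    # Jaccard distance on token sets, chosen only so the sketch runs anywhere.
    sa, sb = set(a), set(b)
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def pair_count(n):
    # A symmetric matrix with a zero diagonal needs n * (n - 1) / 2 evaluations.
    return n * (n - 1) // 2


def estimate_full_runtime(docs, distance, sample_size=50):
    # Time the distance on the first `sample_size` documents and
    # extrapolate linearly to every pair in `docs` (result in seconds).
    sample = docs[:sample_size]
    start = time.perf_counter()
    for a, b in itertools.combinations(sample, 2):
        distance(a, b)
    elapsed = time.perf_counter() - start
    sampled_pairs = pair_count(len(sample))
    per_pair = elapsed / sampled_pairs if sampled_pairs else 0.0
    return per_pair * pair_count(len(docs))


if __name__ == "__main__":
    # Made-up tokenized documents, for illustration only.
    docs = [["word%d" % (i % 7), "word%d" % (i % 11)] for i in range(4000)]
    print(pair_count(len(docs)), "pairs")
    print("estimated seconds:", estimate_full_runtime(docs, toy_distance))
```

If the per-pair time measured this way is the same on the fast and the slow setup, the slowdown is probably outside the distance call itself.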

mjoy
  • Is your RAM utilization high? Maybe you've gone into swap space/paging? – Random Davis May 12 '22 at 15:30
  • According to Activity Monitor Swap = 0 bytes – mjoy May 12 '22 at 16:24
  • What about system resources in general? What's the bottleneck? – Random Davis May 12 '22 at 16:25
  • Python is using 1.5 GB (the highest memory used) and there is 16GB on here. Memory = 9.10 GB, Cached Files = 6.72 GB – mjoy May 12 '22 at 16:27
  • I mean not just RAM. Is disk usage at 100%? CPU? GPU? etc – Random Davis May 12 '22 at 16:47
  • Python is using 99% of CPU. System = 2.6%. User = 11.36% – mjoy May 12 '22 at 16:49
  • So, you now know that the bottleneck is the CPU. As for why that started happening, now you'll have to troubleshoot. Have you tried to exactly and precisely replicate the circumstances that happened when the process took less time? Have you tried to [profile](https://stackoverflow.com/questions/582336/how-do-i-profile-a-python-script) the script to see where the "hot spots" are? – Random Davis May 12 '22 at 16:57
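
The profiling suggestion in the last comment can be sketched with the standard library's `cProfile` and `pstats`. This is a minimal sketch, not the asker's code: `slow_step` and `build_matrix` are hypothetical stand-ins for the distance call and the matrix loop, and `profile_call` is a helper name made up for this example.

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Hypothetical stand-in for the expensive call (e.g. one distance computation).
    return sum(i * i for i in range(n))


def build_matrix(size):
    # Hypothetical stand-in for the real distance-matrix loop.
    return [[slow_step(200) for _ in range(size)] for _ in range(size)]


def profile_call(func, *args, top=10):
    # Run func under cProfile and return (result, report text),
    # with the report sorted by cumulative time and limited to `top` rows.
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_call(build_matrix, 20)
    print(report)
```

Profiling a small slice of the data (a few dozen texts rather than all of them) keeps the run short while still showing which function dominates the CPU time.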

0 Answers