I'm working with fairly large dataframes and text files (thousands of docs) that I'm opening in my IPython notebook. I'm noticing that after a while, my computer becomes really slow. Is there a way to take inventory of my Python program to find out what's slowing down my computer?
When there is no memory space left to store your data, memory swapping, which moves memory data to the HDD, will start. That's why your computer slows down. To avoid this, upgrade your memory. In this case, you should delete objects you don't need using `del variable`. – Kei Minagawa Mar 18 '14 at 05:33
2 Answers
You have a few options. First, you can use third party tools like heapy or PySizer to evaluate your memory usage at different points in your program. This (now closed) SO question discusses them a little bit. Additionally, there is a third option simply called 'memory_profiler' hosted here on GitHub, and according to this blog there are some special shortcuts in IPython for memory_profiler.
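For example, after installing memory_profiler (pip install memory_profiler), the IPython shortcuts look roughly like the sketch below; the file, module, and function names are just placeholders for your own code:
%load_ext memory_profiler

import pandas as pd

# Peak memory used by a single statement:
%memit df = pd.read_csv('big_file.csv')

# Line-by-line memory report for a function; %mprun needs the
# function to be defined in a file, not in a notebook cell:
from my_module import process_docs
%mprun -f process_docs process_docs(df)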
Once you have identified the data structures that are consuming the most memory, there are a few options:
Refactor to take advantage of garbage collection
Examine the flow of data through your program and see if there are any places where large data structures are kept around when they don't need to be. If you have a large data structure that you do some processing on, put that processing in a function and return the processed result so the original memory hog can go out of scope and be destroyed.
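As a rough sketch of that idea (the file and column names here are made up), wrap the processing in a function so only the small result survives:
import pandas as pd

def summarize(path):
    # The large DataFrame only exists inside this function.
    big_df = pd.read_csv(path)
    return big_df.groupby('category').size()

# After summarize() returns, big_df goes out of scope and can be
# garbage collected; only the small result stays in memory.
counts = summarize('documents.csv')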
A comment suggested using the `del` statement. Although the commenter is correct that it will free memory, it really should indicate to you that your program isn't structured correctly. Python has good garbage collection, and if you find yourself manually messing with memory freeing, you should probably put that section of code in a function or method instead, and let the garbage collector do its thing.
Temporary Files
If you really need access to several large data structures (almost) simultaneously, consider writing one or more of them to temporary files while they're not needed. You can use the json or pickle libraries to write data out in structured formats, or simply pprint your data to a file and read it back in later.
I know that seems like some kind of manual hard disk thrashing, but it gives you great control over exactly when the writes to and reads from the hard disk occur. Also, in this case only your files are bouncing on and off the disk. When you use up your memory and swapping starts, everything gets bounced around: data files, program instructions, memory page tables, etc. Everything grinds to a halt instead of just your program running a little more slowly.
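A minimal sketch of the temporary-file approach using the standard pickle and tempfile modules (big_docs is a placeholder for one of your own structures):
import os
import pickle
import tempfile

big_docs = {'doc%d' % i: 'some text ' * 1000 for i in range(1000)}

# Park the structure on disk while it is not needed...
fd, path = tempfile.mkstemp(suffix='.pkl')
with os.fdopen(fd, 'wb') as f:
    pickle.dump(big_docs, f)
big_docs = None  # drop the in-memory copy so it can be collected

# ...and load it back only when you actually need it again.
with open(path, 'rb') as f:
    big_docs = pickle.load(f)
os.remove(path)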
Buy More Memory
Yes, this is an option. But like the `del` statement, it can usually be avoided by more careful data abstraction and should be a last resort, reserved for special cases.

IPython is a wonderful tool, but sometimes it tends to slow things down.
If you have large `print` output, lots of graphics, or your code has grown too big, the `autosave` takes forever to snapshot your notebooks. Try autosaving sparingly with:
%autosave 300
Or disabling it entirely:
%autosave 0
