
Basically, I am not going to post all of the code here, but I will give a generic example. I have a class with a function that runs and builds a large array of values; by my estimates this array shouldn't be much bigger than 10MB. Within the functions it creates and modifies arrays that should be collected once the functions inside train have run; they are not used anywhere else except for the returned tempArray, which is stored into the large array. This is repeated, and the memory used just keeps growing and growing. Is there an issue with my code, or a way around this? I have read here about memory leaks with malloc on Linux: http://pushingtheweb.com/2010/06/python-and-tcmalloc/.
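The comments below ask for a runnable example, so here is a minimal sketch of the pattern being described. The class name, the body of `train`, and the array contents are assumptions pieced together from the comments, not the real code; only the (67, 12, 300, 30) shape comes from the thread:

```python
import numpy as np

class Trainer:
    """Hypothetical reconstruction of the described pattern, not the real code."""

    def __init__(self):
        # the large result array: (67, 12, 300, 30) float64 is about 58 MB
        self.largeArray = np.zeros((67, 12, 300, 30))

    def train(self):
        # creates and modifies temporary arrays; everything except the
        # returned tempArray should be collectable once this call returns
        work = np.random.rand(300, 30)
        tempArray = work * 2.0 - 1.0
        return tempArray

    def run(self):
        for x in range(self.largeArray.shape[0]):
            for y in range(self.largeArray.shape[1]):
                # only tempArray survives, stored into the large array
                self.largeArray[x, y] = self.train()

t = Trainer()
t.run()
```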

J Spen
  • 10MB? On a 32 bit platform, 1000*1000*10*20*4 (4 bytes per int) = 800MB. If your system is 64 bit, double that. – Thomas K May 11 '11 at 11:37
  • My mistake, numpy.zeros uses float64 by default. So 1.6GB on any platform. http://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html – Thomas K May 11 '11 at 11:38
  • You can keep its size to 800MB using `zeros((1000,1000,10,20), dtype=float32)` – eumiro May 11 '11 at 11:43
  • Well, depending on what you're doing, you can keep its size to as little as 200MB (`dtype=int8`). But 10MB it certainly isn't. – Thomas K May 11 '11 at 11:46
  • @Thomas K and @eumiro: the issue isn't the size of the array. It is a memory allocation issue where the memory isn't freed, and I'm not sure why it is occurring. It should actually be around 60 MB; I said 10 MB, but that was with a different test set. Either way, the memory just keeps rising, typically causing a system crash because it goes above the 1.6 GB used for ipython. – J Spen May 11 '11 at 13:27
  • Please show a runnable example. Maybe your estimates are off? – tillsten May 11 '11 at 13:49
  • @tillsten I can't release all my code as it is for a project but I've updated the above with a bit of it if it helps. It still doesn't have all the functions. It has the two outer functions. The inner functions perform operations on the array and return the array. They don't set any global variables of the class. – J Spen May 11 '11 at 14:11
  • If you can't share your code, create a runnable example (preferably short) that reproduces the same problem. Retelling us what your code does won't help us find what's wrong with it. You could have a mistake, or there could be a bug, but without any code to reproduce the issue all we can say is "Well, it works fine for me". – Rosh Oxymoron May 11 '11 at 14:30
  • Will prepare something tomorrow that shows what happens. It is really late here now. – J Spen May 11 '11 at 15:46
  • @Rosh I wrote the code and posted it. I kept trying to get it posted here but the code wouldn't format right so I made a new post. I will see if I can get it here and then remove the other. If not, it's at [link](http://stackoverflow.com/questions/5975255/memory-allocated-to-python-in-the-os-is-never-released-back-in-linux-even-after-g). I'll remove this one shortly. This was rather worthless because I started to get to the bottom of the real issue I think and it's with the garbage collection on linux machines. – J Spen May 12 '11 at 08:28
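To put numbers on the size discussion above (plain arithmetic, no arrays allocated):

```python
from math import prod

shape = (1000, 1000, 10, 20)
n = prod(shape)    # 200,000,000 elements
print(n * 8)   # numpy.zeros default dtype is float64: 1,600,000,000 bytes ≈ 1.6 GB
print(n * 4)   # dtype=float32: 800 MB
print(n * 1)   # dtype=int8:    200 MB

# the (67, 12, 300, 30) shape mentioned in the answer's comments,
# as float64: about 58 MB, matching the "around 60 MB" estimate
print(prod((67, 12, 300, 30)) * 8)     # 57,888,000 bytes
```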

1 Answer


What are you trying to do?

temp = self.largeArray = zeros((1000,1000,10,20))
for y in temp.size:
    for x in temp1.size:
        self.largeArray[x,y] = train()

temp.size equals 200,000,000, and `for y in temp.size` won't even run: you can't iterate over an integer. Even with `range(temp.size)`, how could you store anything into largeArray[x,y] when each of the first two dimensions is only 1000?
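A corrected sketch of the loop, using `.shape` for the per-axis lengths instead of `.size` (a deliberately small shape and a stand-in `train` are used here for illustration):

```python
import numpy as np

def train():
    # hypothetical stand-in for the asker's train()
    return 1.0

temp = np.zeros((4, 5, 10, 20), dtype=np.float32)  # small shape for illustration
print(temp.size)        # 4000: the total element count, not a per-axis length

for x in range(temp.shape[0]):      # first axis: 4
    for y in range(temp.shape[1]):  # second axis: 5
        # the scalar broadcasts over the trailing (10, 20) axes
        temp[x, y] = train()
```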

eumiro
  • Sorry, I wrote that array without thinking about the actual dimensions. The actual array is four-dimensional: (67,12,300,30). – J Spen May 11 '11 at 13:15
  • Let's say a = temp.shape. Then temp.size should be replaced with a[0] (67) and temp1.size with a[1] (12). train then returns a numpy array of shape (300,30), which is assigned to that slice of self.largeArray as shown above. Let me know if you need more clarification. I could send more code, I just couldn't post all of it publicly. – J Spen May 11 '11 at 13:20