
I am using GaussianNB from scikit-learn for my classification. After fitting my data, calling predict throws a MemoryError.

from sklearn.naive_bayes import GaussianNB

clf1 = GaussianNB()
clf1.fit(X_train, y_train)
y_pred1 = clf1.predict(imgarray)

where:

  1. X_train is an array of shape (1413, 2)
  2. y_train is an array of shape (1413,)
  3. imgarray is an array of shape (9000000, 2)

Error:

MemoryError raised by clf1.predict (traceback screenshot omitted)

Other Details:

scikit-learn version 0.15, Windows 7 32-bit, Python 2.7, PyDev, 4 GB RAM

I have tried changing the version and other things, but the problem continues. Is my imgarray too big? I would be thankful for any help and advice.

Piyush
  • Is your `imgarray` size really `9000000`? – badc0re Nov 25 '14 at 09:08
  • @badc0re ...yes, 9000000 rows with 2 columns, holding the R and G bands of the image pixels – Piyush Nov 25 '14 at 09:26
  • Well, I think that is a lot for a 4 GB machine. Imagine how much memory 10,000 images (which is not a lot) would require. It would be good to look into image-processing techniques to reduce your vector size; see the chunked-prediction sketch below. – badc0re Nov 25 '14 at 09:37
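One way to bound peak memory during prediction, independent of shrinking the feature vector, is to predict in chunks instead of on all nine million rows at once. A minimal sketch, assuming the clf1 and imgarray from the question (the chunk count of 100 is an arbitrary choice):

import numpy as np

# Split imgarray into ~90,000-row chunks so predict never has to
# allocate intermediate arrays for all 9,000,000 rows at once.
chunks = np.array_split(imgarray, 100)
y_pred1 = np.concatenate([clf1.predict(chunk) for chunk in chunks])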

1 Answer


I don't think imgarray on its own is enough to crash a 4 GB machine:

In [1]: import numpy as np
In [2]: a = np.zeros((9000000, 2))
In [3]: a.nbytes
Out[3]: 144000000

which is about 137 MB. Are you holding any other large arrays in memory? It's hard to tell without looking at your code. Could you post a complete and runnable piece of code so we can have a closer look?
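For a quick sanity check before reaching for a profiler, a throwaway helper like this (a hypothetical sketch, not a library function) lists the largest NumPy arrays in a namespace by their nbytes:

import numpy as np

def largest_arrays(namespace, top=5):
    # Collect (name, nbytes) for every ndarray bound in the namespace
    # and return the biggest ones first.
    arrays = [(name, obj.nbytes) for name, obj in namespace.items()
              if isinstance(obj, np.ndarray)]
    return sorted(arrays, key=lambda pair: pair[1], reverse=True)[:top]

for name, nbytes in largest_arrays(globals()):
    print("%s: %.1f MB" % (name, nbytes / 1e6))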

Also, you can have a look at this question to learn how to do memory profiling.
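For example, the third-party memory_profiler package (an assumption on my part that it suits your setup; the linked question covers alternatives) gives a line-by-line report when you decorate the suspect function:

# pip install memory_profiler
from memory_profiler import profile
from sklearn.naive_bayes import GaussianNB

@profile  # prints line-by-line memory usage when the function runs
def run(X_train, y_train, imgarray):
    clf = GaussianNB()
    clf.fit(X_train, y_train)
    return clf.predict(imgarray)

Running the script normally then prints how much memory each line of run allocates, which should point at the step that exhausts your RAM.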

mbatchkarov