Here is the current code:
import glob

import numpy as np
from skimage import color, io

# KNN is assumed to be a classifier already fitted elsewhere on ab color values.

def load_data():
    files = glob.glob('../manga-resized/sliced_images/*.png')
    L = []
    target_dist = []
    i = 0
    for fl in files:
        image = color.rgb2lab(io.imread(fl))
        L.append(image[:, :, :1])  # lightness channel, shape (H, W, 1)
        ab = np.vstack(image[:, :, 1:])  # a and b channels, shape (H*W, 2)
        #print('ab shape: %s' % (ab.shape,))
        #print('KNN prediction shape: %s' % (KNN.predict_proba(ab).shape,))
        target_dist.append(KNN.predict_proba(ab))
        i += 1
        print(i)
    print("finished creating L and target_dist")
    X = np.asarray(L)
    y = np.asarray(target_dist)
    # remember to .transpose these later to 0,3,1,2
    print('X shape: %s y shape: %s' % (X.shape, y.shape))
    return X, y
Currently I get the Killed: 9 message after i=391. My computer has 16 GB of RAM, but I think I am somehow doing this really inefficiently. Eventually I hope to do this with nearly 1 million files, not just 400. I feel like this should be possible, because I know people train on datasets much larger than 400 files. So how am I screwing this up? Is there some memory leak? I thought those couldn't happen in Python. Is there any other reason for the Killed: 9 error?
Thanks.
Edit: here is the output of ulimit -a:
Alexs-MBP-6:manga-learn alex$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 256
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 709
virtual memory          (kbytes, -v) unlimited
Here is the output with memory usage printed, after file 221:
https://bpaste.net/show/26109a193e43 . Clearly the available memory is decreasing, but there is still some left by the time the process gets Killed: 9.
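To put numbers on the memory use, here is a rough back-of-the-envelope sketch of how large the two lists get. It assumes that both color.rgb2lab and KNN.predict_proba return float64 arrays (8 bytes per value), which I believe is the default. The image size and class count in the call are hypothetical placeholders, not my real values, and estimate_bytes is just an illustrative helper.

```python
import numpy as np


def estimate_bytes(n_images, height, width, n_classes, dtype=np.float64):
    """Rough size in bytes of the L list and the target_dist list."""
    itemsize = np.dtype(dtype).itemsize
    pixels = n_images * height * width
    l_bytes = pixels * 1 * itemsize             # one L value per pixel
    dist_bytes = pixels * n_classes * itemsize  # one probability per class per pixel
    return l_bytes, dist_bytes


# Hypothetical numbers: substitute the real image size and KNN class count.
l_bytes, dist_bytes = estimate_bytes(n_images=400, height=256, width=256, n_classes=32)
print("L: %.2f GiB, target_dist: %.2f GiB" % (l_bytes / 2.0**30, dist_bytes / 2.0**30))
```

Note that np.asarray on a list of arrays has to allocate a new contiguous array while the list is still alive, so peak usage at that point is roughly double whatever this estimate gives.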
Edit 2: I have read elsewhere that np.asarray is very inefficient. Additionally, when I take that part out of the function, it runs fine and does not get killed. I have seen alternatives such as np.fromiter, but those only cover 1D arrays, not the two 4-dimensional arrays that need to be returned here, X and y. Does anyone know the correct NumPy way to fill these arrays?
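The kind of in-place fill I have in mind might look like the sketch below: allocate X and y once with np.empty, then write each file's data into its slot, so no list of arrays and no final copy is needed. This is only a sketch under assumptions: every image has the same height and width, float32 precision is acceptable, and load_one is a hypothetical callback standing in for the rgb2lab and KNN.predict_proba steps. Here y keeps the (N, H*W, n_classes) shape that the loop above produces.

```python
import numpy as np


def fill_preallocated(n_images, height, width, n_classes, load_one):
    """Allocate X and y once, then write each image's data in place."""
    X = np.empty((n_images, height, width, 1), dtype=np.float32)
    y = np.empty((n_images, height * width, n_classes), dtype=np.float32)
    for i in range(n_images):
        # load_one is hypothetical: it returns the (H, W, 1) lightness
        # channel and the (H*W, n_classes) distribution for image i.
        l_channel, dist = load_one(i)
        X[i] = l_channel  # cast to float32 and stored in place
        y[i] = dist
    return X, y


def fake_load_one(i):
    """Stand-in loader that returns random data of the right shapes."""
    rng = np.random.RandomState(i)
    return rng.rand(4, 4, 1), rng.rand(16, 3)


X, y = fill_preallocated(5, 4, 4, 3, fake_load_one)
print(X.shape, y.shape)
```

If the full arrays still do not fit in RAM, the same loop should work with np.memmap in place of np.empty, so that the arrays are backed by a file on disk, though I have not tried that here.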