8

It sounds easy, but I don't know how to do it.

I have a 2D numpy array of shape

X = (1783,30)

and I want to split it into batches of 64. I wrote the code like this:

batches = abs(len(X) / BATCH_SIZE ) + 1  # gives 28 (Python 2 integer division)

I am trying to predict the results batch by batch, so I fill each batch with zeros and then overwrite them with the input data before predicting.

predicted = []

for b in xrange(batches): 

 data4D = np.zeros([BATCH_SIZE,1,96,96]) #create 4D array, first value is batch_size, last number of inputs
 data4DL = np.zeros([BATCH_SIZE,1,1,1]) # need to create 4D array as output, first value is  batch_size, last number of outputs
 data4D[0:BATCH_SIZE,:] = X[b*BATCH_SIZE:b*BATCH_SIZE+BATCH_SIZE,:] # fill value of input xtrain

 #predict
 #print [(k, v[0].data.shape) for k, v in net.params.items()]
 net.set_input_arrays(data4D.astype(np.float32),data4DL.astype(np.float32))
 pred = net.forward()
 print 'batch ', b
 predicted.append(pred['ip1'])

print 'Total in Batches ', data4D.shape, batches
print 'Final Output: ', predicted

But the last batch (number 28) has only 55 elements instead of 64 (1783 elements in total), and it gives

ValueError: could not broadcast input array from shape (55,1,96,96) into shape (64,1,96,96)

What is the fix for this?

PS: the network requires a batch size of exactly 64 to predict.

pbu
  • Your question is unclear to me (and considering the amount of views and no answers, I'm not the only one). 1). Which module does `net` come from? 2) You have a 2D array X. You want to process rows 0:64, then 64:2*64, then 2*64:3*64 and so on. And you know 1783 isn't a multiple of 64? That's where the error is coming from in any case. Try to be more explicit about what you want, possibly reducing yourself to a simpler example of say 5x4. – Oliver W. Feb 13 '15 at 22:15

4 Answers

15

I don't really understand your question either, especially what X looks like. If you want to create sub-groups of equal size of your array, try this:

def group_list(l, group_size):
    """
    :param l:           list
    :param group_size:  size of each group
    :return:            Yields successive group-sized lists from l.
    """
    for i in xrange(0, len(l), group_size):
        yield l[i:i+group_size]
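
As a quick check of what this yields for the array in the question, here is a minimal sketch written for Python 3 (so `range` replaces `xrange`). The zero-filled `X` is only a stand-in for the real data, and `group_list` is restated so the snippet runs on its own.

```python
import numpy as np

def group_list(l, group_size):
    """Yield successive group_size-sized slices of l (the last may be shorter)."""
    for i in range(0, len(l), group_size):
        yield l[i:i + group_size]

X = np.zeros((1783, 30))          # stand-in with the shape given in the question
groups = list(group_list(X, 64))
sizes = [len(g) for g in groups]
print(len(groups), sizes[0], sizes[-1])   # 28 64 55
```

Note that the last group holds only 55 rows, so a network that insists on full batches of 64 would still need that group padded.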
poli_g
  • The network can only predict on batches of exactly 64 samples, so the last batch needs dummy data to bring it up to 64. – pbu Feb 15 '15 at 19:05
1

I found a SIMPLE way of solving the batch problem: generate a dummy array of zeros and then fill it with the real data.

data = np.zeros((batches*BATCH_SIZE, 1, 96, 96))  # dummy array of shape (28*64, 1, 96, 96)
data[:len(X)] = X  # copy the real samples in (assumes X is already N x 1 x 96 x 96); the tail stays zero

This way every batch that is loaded has exactly 64 samples. The last batch has dummy zeros at the end, but that's OK :)

preds = []
for b in xrange(batches):
 data4D[0:BATCH_SIZE,:] = data[b*BATCH_SIZE:b*BATCH_SIZE+BATCH_SIZE,:]
 preds.append(net.predict(data4D))  # keep each batch's output in the list

# assumes each prediction is an array with the batch on the first axis
output = np.concatenate(preds)[:1783]  # keep the first 1783 rows

Finally I slice the 1783 real elements out of the 28*64 total. This worked for me, but I am sure there are many other ways.
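
The same pad-then-trim idea can be sketched end to end without the network. This is only an illustration under stated assumptions: `predict_in_batches` and `fake_predict` are hypothetical names, `fake_predict` stands in for the real model, the small 8x8 inputs are made up to keep it quick, and the code is written for Python 3.

```python
import numpy as np

BATCH_SIZE = 64

def predict_in_batches(X, predict_fn, batch_size=BATCH_SIZE):
    """Zero-pad X to a multiple of batch_size, predict per batch, drop the padding."""
    n = len(X)
    n_batches = -(-n // batch_size)               # ceiling division
    padded = np.zeros((n_batches * batch_size,) + X.shape[1:], dtype=X.dtype)
    padded[:n] = X                                # real samples first, zeros after
    outputs = [predict_fn(padded[i:i + batch_size])
               for i in range(0, len(padded), batch_size)]
    return np.concatenate(outputs)[:n]            # keep only the real rows

# Hypothetical stand-in for the network: one output per sample (its mean).
def fake_predict(batch):
    assert batch.shape[0] == BATCH_SIZE           # the model insists on full batches
    return batch.reshape(len(batch), -1).mean(axis=1, keepdims=True)

X = np.random.rand(1783, 1, 8, 8).astype(np.float32)
out = predict_in_batches(X, fake_predict)
print(out.shape)   # (1783, 1)
```

Ceiling division is used here instead of `len(X) / BATCH_SIZE + 1` so that no extra all-zero batch is created when the number of samples is already a multiple of the batch size.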

pbu
0

This can be achieved using as_strided of numpy.

import numpy as np
from numpy.lib.stride_tricks import as_strided
def batch_data(test, batch_size):
    m,n = test.shape
    S = test.itemsize
    if not batch_size:
        batch_size = m
    count_batches = m//batch_size
    # Batches which can be covered fully
    test_batches = as_strided(test, shape=(count_batches, batch_size, n), strides=(batch_size*n*S,n*S,S)).copy()
    covered = count_batches*batch_size
    if covered < m:
        rest = test[covered:,:]
        rm, rn = rest.shape
        mismatch = batch_size - rm
        last_batch = np.vstack((rest,np.zeros((mismatch,rn)))).reshape(1,-1,n)
        return np.vstack((test_batches,last_batch))
    return test_batches
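
For comparison, a similar zero-padded result can be sketched with `np.pad` and `reshape` instead of `as_strided`. This is a different technique from the function above, not a restatement of it; `batch_with_pad` is a hypothetical name and the `arange` data is made up for the example.

```python
import numpy as np

def batch_with_pad(a, batch_size):
    """Return a (n_batches, batch_size, n_cols) copy of 2D array a, zero-padded."""
    m = a.shape[0]
    pad_rows = (-m) % batch_size                  # rows needed to reach a multiple
    padded = np.pad(a, ((0, pad_rows), (0, 0)), mode="constant")
    return padded.reshape(-1, batch_size, a.shape[1])

a = np.arange(1783 * 30, dtype=np.float64).reshape(1783, 30)
b = batch_with_pad(a, 64)
print(b.shape)   # (28, 64, 30)
```

Because it never touches strides by hand, this version does not depend on the memory layout of the input array.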
MSS
-2

data4D[0:BATCH_SIZE,:] should be data4D[b*BATCH_SIZE:b*BATCH_SIZE+BATCH_SIZE, :].

Peter
  • Can you explain your answer, please? – Zulu Feb 15 '15 at 20:14
  • That won't work: the network takes 4D data with a batch size of exactly 64. If the input array has a different size, the network model will throw an error. – pbu Feb 15 '15 at 23:54