I have a dataset of 1.4 million samples x 32 features.
I want to convert each sample into the concatenation of the 1000 previous samples plus the sample itself. Since I don't have earlier data for the first 1000 samples, I drop them. Thus, each sample has 1001*32 features after the conversion. I use the code below, but it crashes every time, even on my 12GB RAM laptop. What am I doing wrong here? How can I make this computation feasible?
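To make the intended transformation concrete, here is a toy sketch (the shapes and window size below are made up for illustration, not my real data):

import numpy as np

# Toy illustration: 5 samples, 3 features, window of 2 previous samples.
X = np.arange(15).reshape(5, 3)
window = 2
# The sample at index 2 becomes rows 0..2 flattened: (window + 1) * 3 = 9 values.
print(X[2 - window:2 + 1].flatten())   # [0 1 2 3 4 5 6 7 8]

My real code does the same thing, but with a window of 1000 previous samples: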
import numpy as np

def take_previous_data(X_train, y):
    # Drop the first 1000 samples, since they have no full history.
    temp_train_data = X_train[1000:]
    temp_labels = y[1000:]
    final_train_set = []
    for index, row in enumerate(temp_train_data):
        actual_index = index + 1000
        # Previous 1000 rows plus the current row, flattened to 1001*32 values.
        final_train_set.append(X_train[actual_index - 1000:actual_index + 1].flatten())
    return np.array(final_train_set), temp_labels
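For reference, a rough estimate of the size the final array would need (assuming float64 values, i.e. 8 bytes each, which is my assumption about the dtype):

# Back-of-the-envelope size of the converted array (float64 assumed).
n_samples = 1400000 - 1000      # rows left after dropping the first 1000
n_features = 1001 * 32          # 32032 features per converted sample
size_bytes = n_samples * n_features * 8
print(size_bytes / 1e9)         # roughly 358 GB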
Note: Using Python 2.7