
I have a np.ndarray of length 398 and this is my train set, X_train. I am trying to make 10 different train sets,

X_train1
X_train2
X_train3

so on and so forth, by getting random rows from array X_train, all of equal length 40.

How can I do this?

Vadim Kotov
kyle54
  • Does this answer your question? [Numpy: Get random set of rows from 2D array](https://stackoverflow.com/questions/14262654/numpy-get-random-set-of-rows-from-2d-array) – Adam.Er8 Nov 28 '19 at 12:37

2 Answers


Sample the data set over 10 iterations, and save the sample in a dictionary object with the name you want.

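import numpy as np

# X_train is assumed to be an existing ndarray with at least 40 rows.
data = {}
for i in range(10):
    name = 'X_train' + str(i + 1)
    # np.random.choice only accepts 1-D input, so sample row indices
    # (without repeats) and use them to index X_train.
    rows = np.random.choice(X_train.shape[0], 40, replace=False)
    data[name] = X_train[rows]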
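If the subsets are meant to be fed to a classifier, the labels have to be sampled with the same row indices as the features. The following is only a minimal sketch of that idea: the stand-in data, the `y_train` array, and the helper name `sample_train_sets` are assumptions made for illustration and are not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

# Stand-in data (assumption): 398 rows, 5 feature columns, binary labels.
X_train = rng.normal(size=(398, 5))
y_train = rng.integers(0, 2, size=398)

def sample_train_sets(X, y, n_sets=10, n_rows=40, rng=rng):
    """Return n_sets (X_subset, y_subset) pairs of n_rows random rows each."""
    subsets = []
    for _ in range(n_sets):
        # Draw row indices without replacement so no row repeats in a subset.
        idx = rng.choice(X.shape[0], size=n_rows, replace=False)
        # The same indices select matching rows from features and labels.
        subsets.append((X[idx], y[idx]))
    return subsets

train_sets = sample_train_sets(X_train, y_train)
X_train1, y_train1 = train_sets[0]  # each element is a pair of plain ndarrays
print(len(train_sets), X_train1.shape, y_train1.shape)
```

Each subset is still an ordinary ndarray, so it can be passed to an estimator directly; keeping the subsets in a list or dict is usually easier to loop over than ten separately named variables.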
rich
Shagun Sodhani
  • the thing is I need the output to be saved in 10 different variables (from X_train1 to X_train10) and they must also be ndarray I think so I can feed into a DecisionTreeClassifier – kyle54 Nov 28 '19 at 13:18

I think you have one of the two problems below.

Problem 1 :- you have a 2D array and want to draw a few random samples from it. See the code and output below.

CODE

import numpy as np
X_train = np.random.randint(5, size=(10, 3))  # replace this with your own 2D array

X_train_change = {}

number_of_xtrain = 5  # number of samples to draw from X_train
for i in range(number_of_xtrain):
    row = np.random.randint(X_train.shape[0])  # random row index
    # Each entry holds a single randomly chosen row of X_train.
    X_train_change['X_Train%s' % (i + 1)] = X_train[row, :]

print(X_train_change)

Output

{'X_Train1': array([4, 1, 4]), 'X_Train2': array([3, 1, 1]), 'X_Train3': array([3, 1, 1]), 'X_Train4': array([4, 1, 4]), 'X_Train5': array([3, 4, 2])}

Problem 2 :- you have a 1D array such as [1,2,3,4,5,6,7,8,9,10] and you want to split it into smaller pieces such as X_train1=[1,2,3,4] and X_train2=[5,6,7,8,9,10]. See the code below.

CODE

import numpy as np
X_train = np.random.randint(5, size=(400))  # the length must be exactly divisible by the number of columns you want (here 40)
print(X_train)
print(X_train.reshape(-1,40))

Output

    [[4 3 1 2 1 3 3 3 0 3 1 1 0 3 4 4 3 4 3 0 2 1 2 1 1 1 0 4 4 4 0 0 1 4 4 1
  1 1 1 4]
 [4 1 1 0 1 2 2 0 2 3 0 3 4 2 0 4 2 3 1 4 4 4 2 0 1 3 1 3 2 1 4 2 2 2 3 3
  1 1 4 4]
 [1 3 0 0 0 2 0 4 0 0 2 1 3 3 2 4 0 1 0 0 3 2 1 4 4 1 4 1 3 2 2 0 2 4 0 2
  3 4 4 4]
 [0 3 1 2 0 1 0 0 0 1 0 2 4 3 1 2 2 3 4 0 3 4 4 2 4 1 2 0 4 4 2 3 2 2 2 2
  4 0 3 3]
 [0 1 4 3 1 2 3 1 4 0 0 3 4 4 2 2 0 0 0 1 3 2 4 4 0 2 3 1 0 0 1 3 4 4 3 1
  1 0 0 2]
 [4 2 1 2 3 1 1 3 2 1 1 2 3 3 2 0 1 0 1 3 0 1 2 3 1 3 3 1 2 4 2 1 4 2 1 3
  3 4 3 4]
 [4 3 4 1 1 0 4 1 4 2 0 4 3 1 2 4 0 1 3 3 2 1 3 0 4 3 1 1 1 3 2 1 4 0 2 0
  0 4 3 2]
 [0 1 0 4 2 4 1 1 4 0 1 2 4 1 4 1 2 3 4 4 4 2 1 3 2 3 1 4 4 4 2 2 0 4 1 0
  0 0 4 2]
 [0 0 2 4 4 4 2 4 4 1 1 1 2 0 1 1 4 1 0 0 3 0 4 3 1 3 4 2 0 4 4 3 1 0 4 1
  0 3 0 1]
 [2 1 4 4 2 2 1 1 4 0 1 1 2 1 1 1 0 4 0 4 1 4 4 0 4 3 4 2 4 4 1 1 2 0 3 2
  3 2 1 1]]
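Note that `reshape` needs a length that divides evenly and keeps the original order, so the chunks are not random. One possible workaround, sketched here with a stand-in array of 398 rows (the data and the names `shuffled` and `chunks` are assumptions for illustration), is to shuffle the rows first and then use `np.array_split`, which accepts lengths that do not divide evenly.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

# Stand-in 2D data (assumption): 398 rows, 3 columns.
X_train = np.arange(398 * 3).reshape(398, 3)

# Shuffle the rows by indexing with a random permutation of the row indices.
shuffled = X_train[rng.permutation(X_train.shape[0])]

# Split along axis 0 into 10 disjoint chunks; sizes may differ by one row
# when the number of rows is not divisible by 10.
chunks = np.array_split(shuffled, 10)

print([c.shape[0] for c in chunks])
# → [40, 40, 40, 40, 40, 40, 40, 40, 39, 39]
```

Unlike independent random sampling, these chunks never share a row, which means 398 rows cannot yield ten full sets of 40; two of the chunks end up with 39 rows.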
lonewolf
  • My X_train is a 2D array with values already in it (length 29), so I don't think I can use the np.random.randint part :( Do you know of a way to do it with my own array? Thank you so much! – kyle54 Nov 28 '19 at 14:36
  • Are you talking about the 2nd line of the first code block? If so, you just need to assign your own array to X_train. I only created a 2D array of random numbers to demonstrate the manipulation. – lonewolf Nov 28 '19 at 16:24