I am learning neural networks and am trying to automate some of the processes. Right now, I have code that randomly splits the dataset, a 284807x31 array. Next, I need to separate inputs from outputs: select every column except the last as the inputs, and only the last column as the outputs. I can't figure out how to do this properly and am stuck at this step. Here's my code so far (the part relevant to this specific problem):

train, test, cv = np.vsplit(data[np.random.permutation(data.shape[0])], (6,8))

# Should select entire array except the last column
train_inputs = np.resize(train, len(train[:,1]), -1)
test_inputs = np.resize(test, len(test[:,1]), -1)
cv_inputs = np.resize(cv, len(cv[:,1]), -1)

# Should select **only** the last column.
train_outs = train[:, 30]
test_outs = test[:, 30]
cv_outs = cv[:, 30]

The idea is that the script should work out the number of columns in each dataset by itself and do the intended selection. The second part should select only the last column, but I am not sure whether it works because the script stops before reaching it. The error is:

Traceback (most recent call last):
  File "src/model.py", line 43, in <module>
    train_inputs = np.resize(train, len(train[:,1]), -1)
TypeError: resize() takes exactly 2 arguments (3 given)

PS: Now that I am looking at the documentation, I can see I am very far from the solution but I really can't figure it out. It's the first time I am using NumPy.

Thanks in advance.

Gustavo Silva
  • `np.resize` is a rarely used function. `reshape` is more useful (as in http://stackoverflow.com/questions/41795638/collapsing-all-dimensions-of-numpy-array-except-the-first-two). And the index slicing as shown in the answer is very common. Maybe you are used to using `resize` in `VBA` or other languages. – hpaulj Jan 22 '17 at 23:51
  • I see. Thanks! I am not used to any language. Just a guy trying to learn new skills and I am searching for problems as I go. I found that solution to one of the problems but apparently was not a good solution :) – Gustavo Silva Jan 23 '17 at 00:08

1 Answer

Some slicing should help:

Should select entire array except the last column:

train_inputs = train[:,:-1]
test_inputs = test[:,:-1]
cv_inputs = cv[:,:-1]

and:

Should select only the last column:

train_outs = train[:,-1]
test_outs = test[:, -1]
cv_outs = cv[:, -1]
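
To check the slicing end to end, here is a minimal, self-contained sketch. It assumes a small random array as a stand-in for the real 284807x31 dataset, and the helper name `split_inputs_outputs` is made up for illustration; only the `[:, :-1]` and `[:, -1]` slices come from the answer itself.

```python
import numpy as np

# Hypothetical stand-in for the real dataset: 10 rows, 31 columns.
rng = np.random.default_rng(0)
data = rng.random((10, 31))

# Shuffle the rows, then cut at row indices 6 and 8, as in the question.
shuffled = data[rng.permutation(data.shape[0])]
train, test, cv = np.vsplit(shuffled, (6, 8))


def split_inputs_outputs(arr):
    """Return (all columns except the last, the last column) of a 2-D array."""
    return arr[:, :-1], arr[:, -1]


train_inputs, train_outs = split_inputs_outputs(train)
test_inputs, test_outs = split_inputs_outputs(test)
cv_inputs, cv_outs = split_inputs_outputs(cv)

# Inputs keep 30 columns; outputs are 1-D arrays with one value per row.
print(train_inputs.shape, train_outs.shape)
print(test_inputs.shape, test_outs.shape)
print(cv_inputs.shape, cv_outs.shape)
```

Because `-1` counts from the end, the same code works for any number of columns, so nothing like `30` needs to be hard-coded. Note also that the `(6, 8)` passed to `np.vsplit` are row indices, not proportions: on the full dataset they would leave 6 rows in `train`, 2 in `test`, and everything else in `cv`. If a 60/20/20 split is what is wanted (an assumption on my part), indices such as `(int(0.6 * n), int(0.8 * n))` with `n = data.shape[0]` would be one way to get it.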
Mike Müller
  • I think this solved the issue, even though I am getting another issue but that is related with other parts of the code. Thanks! – Gustavo Silva Jan 22 '17 at 23:29