I have a 2-D array called dataSet that is 335 x 225. I want to fill a new, empty 1-D array called newDataSet, of size 1 x 7537, with the first 7537 data points of dataSet. How can I do this?

I want to use a for loop, like so:

for row in trainingData.shape[0]
    for column in trainingData.shape[1]
    bin_1 = trainingData[:33,:255]
Barmar
cool_beans
  • `7537` or `75375` ? – akilat90 Oct 15 '17 at 02:55
  • 7537. I want to make 10 new arrays that are made up of subsets of the original data. so the first 9 arrays (1-D) would have 7537 data and the last one would have 7542. – cool_beans Oct 15 '17 at 02:57
  • Possible duplicate of [From ND to 1D arrays](https://stackoverflow.com/questions/13730468/from-nd-to-1d-arrays) – Barmar Oct 15 '17 at 02:58
  • `trainingData.shape[i]` will always be an integer, which is not iterable. If you want to iterate *up to* that integer, use `range(trainingData.shape[i])` for whatever value of `i` – Zoey Hewll Oct 15 '17 at 03:32
  • but I am not getting this: do you want a single array or a list of arrays in the end? A single array could be troublesome because you need some alignment as the data does not fit perfectly. – norok2 Oct 15 '17 at 09:13
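The `range` point raised in the comments can be sketched as a working loop. This is only an illustration of the loop-based approach: the stand-in data and the names `n_wanted` and `count` are assumptions, not part of the original question.

```python
import numpy as np

# Stand-in for the real data (assumed); any 335 x 225 array works.
trainingData = np.arange(335 * 225).reshape(335, 225)

n_wanted = 7537  # number of data points to copy
newDataSet = np.empty(n_wanted, dtype=trainingData.dtype)

count = 0
# shape[0] and shape[1] are integers, so wrap them in range() to iterate.
for row in range(trainingData.shape[0]):
    for column in range(trainingData.shape[1]):
        if count == n_wanted:
            break
        newDataSet[count] = trainingData[row, column]
        count += 1
    if count == n_wanted:
        break
```

The loop walks the array row by row, so it should produce the same values as `trainingData.ravel()[:7537]`, only more slowly.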

2 Answers

You don't need for loops if you use NumPy:

import numpy as np    
dataset = np.random.random((335, 225))

First reshape the dataset into a 1 x (335*225) array, then extract the first 7537 elements from it:

new_array = dataset.reshape(1, 75375)[:,0:7537]
akilat90
  • Great idea with the reshaping :) But I also don't want to reuse data points from the original data. In other words, the first array should contain points 1-7537, the second 7538-15,074, etc. So would `new_array = dataset.reshape(1, 75375)[:,0:7537]` still work? – cool_beans Oct 15 '17 at 03:33
  • Use `arrays_1_to_9 = dataset.ravel()[0:7537*9].reshape((9, 7537))` and `last_array = dataset.ravel()[7537*9:]`. But there should be a more elegant way. – akilat90 Oct 15 '17 at 03:53
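The suggestion in the last comment can be written out as a small runnable sketch. The random data and the name `chunk` are assumptions for illustration; the slicing itself follows the comment.

```python
import numpy as np

dataset = np.random.random((335, 225))  # stand-in data, as in the answer
chunk = 7537                            # size of each of the first 9 arrays

flat = dataset.ravel()                  # 1-D, row-major view of the data

# First 9 non-overlapping chunks, stacked as rows of a (9, 7537) array.
arrays_1_to_9 = flat[:chunk * 9].reshape((9, chunk))

# Whatever is left over (7542 points) forms the 10th array.
last_array = flat[chunk * 9:]

print(arrays_1_to_9.shape, last_array.shape)
```

Row `i` of `arrays_1_to_9` covers `flat[i*chunk:(i+1)*chunk]`, so no data point is reused between arrays.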

A one-step method to generate your subsets, where `a` is the original 2-D array:

subsets = np.split(a.ravel(), range(0, a.size, a.size // 10)[1:-1])

In [30]: [len(s) for s in subsets]
Out[30]: [7537, 7537, 7537, 7537, 7537, 7537, 7537, 7537, 7537, 7542]

These are just views of the original array, so the data is not copied; writing to a subset also changes `a`. That is normally not a problem for training sets.
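A self-contained sketch of the same split, with a check that the pieces really are views. The stand-in array and the names `step` and `cut_points` are assumptions added for illustration.

```python
import numpy as np

# Stand-in for the real data (assumed).
a = np.arange(335 * 225, dtype=float).reshape(335, 225)

step = a.size // 10                         # 7537
cut_points = range(0, a.size, step)[1:-1]   # 9 interior boundaries
subsets = np.split(a.ravel(), cut_points)   # 10 pieces; the last takes the remainder

print([len(s) for s in subsets])

# The pieces share memory with `a`, so nothing was copied.
print(np.shares_memory(subsets[0], a))

# Writing through a subset is visible in the original array.
subsets[0][0] = -1.0
print(a[0, 0])
```

If independent arrays are needed, calling `.copy()` on each subset should break the link to `a`.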

B. M.