I have two NumPy arrays. One is a 2D array of shape (200, x), where x can be any positive integer. The other is a 1D array of shape (x,), with the same x as the 2D array. If the second dimension of the 2D array is greater than 1000, I want to randomly drop columns from both arrays in unison so that the 2D array becomes (200, 1000) and the corresponding 1D array becomes (1000,). I know I can use np.delete, but I don't know how to ensure that columns are dropped randomly so that the second dimension of the 2D array is 1000 and the length of the 1D array is also 1000. Any insights would be appreciated.

John

1 Answer

All you need to do is decide which columns to keep/drop first. Then keep/drop those columns from both arrays. You have x columns. You want to select any 1000 of these randomly.

Following the approach in "Generate 'n' unique random numbers within a range", you can draw 1000 unique column indices from the range [0, x).

import random

# x is the number of columns; random.sample needs x >= 1000
x = original_matrix.shape[1]
sel_cols = random.sample(range(x), 1000)

Next, you can select these columns from the numpy arrays:

downsized_matrix = original_matrix[:, sel_cols]

downsized_vector = original_vector[sel_cols]
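
For a NumPy-only variant of the same idea, here is a minimal sketch. The helper name drop_random_columns and its max_cols and seed parameters are my own, for illustration only. It assumes the vector length matches the number of matrix columns, and it leaves both arrays untouched when there are already 1000 columns or fewer.

```python
import numpy as np

def drop_random_columns(matrix, vector, max_cols=1000, seed=None):
    """Randomly keep at most max_cols columns, using the same indices for both arrays."""
    n_cols = matrix.shape[1]
    if vector.shape[0] != n_cols:
        raise ValueError("vector length must match the number of matrix columns")
    if n_cols <= max_cols:
        # Nothing to drop: return the inputs unchanged
        return matrix, vector
    rng = np.random.default_rng(seed)
    # Draw max_cols unique column indices, then sort to preserve the original order
    keep = np.sort(rng.choice(n_cols, size=max_cols, replace=False))
    return matrix[:, keep], vector[keep]

# Example data (hypothetical): 1500 columns, so 500 are dropped at random
original_matrix = np.arange(200 * 1500).reshape(200, 1500)
original_vector = np.arange(1500)
downsized_matrix, downsized_vector = drop_random_columns(
    original_matrix, original_vector, seed=0
)
print(downsized_matrix.shape, downsized_vector.shape)  # (200, 1000) (1000,)
```

Sorting the kept indices is optional. It only keeps the surviving columns in their original relative order, which random.sample does not do.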
Pranav Hosangadi