1

I want to generate labels that are split into equally sized groups. For example, I have the following Python code:

import numpy as np

p = 2
n  = 10
labels = np.random.randint(p, size=n)

The code above creates 10 labels (values 0 and 1), but the number of 0's is not necessarily equal to the number of 1's. What I need is to automatically obtain equal counts: for example 0 1 0 0 1 1 0 1 0 1 (where there are exactly five 0's and five 1's). The same idea should generalize to any p and n such that n/p is an integer.

Any help would be much appreciated!

Christina

3 Answers

2

Just create a list with an equal number of zeros and ones, and then shuffle it using np.random.shuffle(arr).

https://numpy.org/doc/stable/reference/random/generated/numpy.random.shuffle.html
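As a minimal sketch of the idea for the p = 2, n = 10 case from the question: build the balanced list first, then shuffle it in place. This sketch assumes the newer Generator API (np.random.default_rng) rather than np.random.shuffle; the variable name half is only illustrative.

```python
import numpy as np

n = 10         # total number of labels, as in the question
half = n // 2  # illustrative name: how many 0's and how many 1's

# Five 0's followed by five 1's, as a plain Python list
labels = [0] * half + [1] * half

# Assumption: a Generator is used here in place of np.random.shuffle
rng = np.random.default_rng()
rng.shuffle(labels)  # shuffles the list in place
```

The counts stay fixed at five each; only the order changes from run to run.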

Kaushal Sharma
  • And how do you create a list with the equal number of zeros and ones? – Christina Aug 03 '21 at 09:02
  • @Christina many ways. [0,0,0,0,0,1,1,1,1,1] :) or [0]*5 + [1]*5 – JeffUK Aug 03 '21 at 09:03
  • There are many ways; one of them is to use `numpy.ones` and `numpy.zeros` to create the arrays and then concatenate them. – Kaushal Sharma Aug 03 '21 at 09:04
  • @JeffUK great thank you. The second one is what I need. – Christina Aug 03 '21 at 09:04
  • @KaushalSharma I think the best one is the second part of what JeffUK has written. – Christina Aug 03 '21 at 09:08
  • @Christina You are welcome. Yes indeed, it is Pythonic. But just to mention that NumPy arrays are much faster than Python lists; you might want to read about it if you need them for large-scale applications. https://stackoverflow.com/questions/8385602/why-are-numpy-arrays-so-fast – Kaushal Sharma Aug 03 '21 at 09:28
2

To answer the generic case: generate a list in which each label 0..p-1 appears n/p times, then shuffle it:

labels = [int(x/(n/p)) for x in range(0,n)]
np.random.shuffle(labels)
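One way to check that the result really is balanced is to count how often each label occurs, for instance with np.bincount. The values p = 5 and n = 20 below are only example values, not taken from the question.

```python
import numpy as np

p = 5   # number of distinct labels (example value)
n = 20  # total number of labels; assumed to be divisible by p

# Same construction as above: index x is mapped to group int(x / (n / p))
labels = [int(x / (n / p)) for x in range(0, n)]
np.random.shuffle(labels)

# counts[k] is the number of times label k appears
counts = np.bincount(labels, minlength=p)
print(counts)
```

Every entry of counts should equal n // p, whatever order the shuffle produced.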
JeffUK
1

You can create an array of zeros and ones and shuffle it in place. You can use something like this:

import numpy as np

p = 2
n  = 10
# n // p zeros followed by n // p ones, stored as integers (covers p = 2 only)
labels = np.append(np.zeros(n // p, dtype=int), np.ones(n // p, dtype=int))
np.random.shuffle(labels)
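The snippet above handles p = 2. A possible generalization to any p, sketched here under the assumption that n is divisible by p, is to repeat each label with np.repeat and then permute the result. The values p = 3 and n = 12 are example values only.

```python
import numpy as np

p = 3   # number of distinct labels (example value)
n = 12  # total number of labels; assumed to be divisible by p

# Each label 0..p-1 repeated n // p times: [0 0 0 0 1 1 1 1 2 2 2 2]
labels = np.repeat(np.arange(p), n // p)

# Assumption: the Generator API is used; permutation returns a shuffled copy
rng = np.random.default_rng()
labels = rng.permutation(labels)
```

Because np.arange yields integers, the labels keep an integer dtype and can be used directly as indices.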
alparslan mimaroğlu