To sample points uniformly in the half-disk, generate uniformly distributed angles phi and take the square root of a uniformly generated radius r. The square root accounts for the fact that we want to sample uniformly over the area, not the radius; see the explanation below.
import numpy as np
import matplotlib.pyplot as plt
# sample
size = 10000
R = 1
phi = np.random.random(size=size) * np.pi
r = np.sqrt(np.random.random(size=size)) * R
# transform
x = r * np.cos(phi)
y = r * np.sin(phi)
# plot
f = plt.figure(figsize=(12,12))
a = f.add_subplot(111)
a.scatter(x, y, marker='.')
a.set_aspect('equal')
plt.show()

Explanation
To generate uniformly distributed points on a (half-)disk we have to make sure that each infinitesimal area element is "hit" with the same probability. We can simply sample phi from a uniform distribution on [0, 1), multiplied by np.pi (giving [0, pi)), since all angles should be equally likely. But if we also sample r uniformly from [0, 1), we generate too many points at small radii and not enough at large radii, since the area enclosed grows like r**2. To account for that, we have to bias the sampled radii accordingly, and here the correct bias is simply the square root (np.sqrt): it weights the radius values so that the larger area of the outer rings receives proportionally more points.
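The same biasing follows from inverse transform sampling: the CDF of the radius is F(r) = r**2 / R**2, so inverting u = F(r) gives r = R * np.sqrt(u). A quick sanity check (just a sketch; the seed, sample size, and bin count are arbitrary choices) compares the empirical density of r against the expected linear density 2*r/R**2:

```python
import numpy as np

rng = np.random.default_rng(0)
R = 1.0
# Inverse-CDF sample of the radius: u uniform on [0, 1), r = R * sqrt(u)
r = R * np.sqrt(rng.random(100_000))

# Histogram the radii and compare with the theoretical density f(r) = 2*r/R**2.
counts, edges = np.histogram(r, bins=20, range=(0, R), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
expected = 2 * centers / R**2

# With 100k samples the empirical density should track f(r) closely.
print(np.max(np.abs(counts - expected)))
```

Because the target density is linear in r, the bin-center value equals the bin average exactly, so any deviation here is purely statistical noise.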
A much better and more thorough explanation is found here: https://stackoverflow.com/a/50746409/1170207
Performance comparison to rejection sampling methods
Since this method is basically an inversion sampling method, we compare its
performance to a rejection sampling algorithm.
import numpy as np
x, y = np.random.random(size=(2,10000))
%timeit r, phi = np.sqrt(x), y
# 19 µs ± 33.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit m = x**2 + y**2 <= 1; xx, yy = x[m], y[m]
# 81.5 µs ± 271 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
With the rejection sampling method we also cannot guarantee drawing a fixed number of variates, so we have to repeat the process until we have enough. This does not vectorize as nicely, unless we accept sampling too many values and discarding the extras.
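One vectorized workaround along those lines is to oversample by the expected acceptance rate and loop only if we still fall short. This is a sketch, not part of the timing above; the function name, the 10% safety margin, and the loop structure are my own choices. The acceptance probability is the half-disk area (pi/2 * R**2) over the bounding-box area (2 * R**2), i.e. pi/4:

```python
import numpy as np

def rejection_half_disk(n, R=1.0, rng=None):
    """Draw exactly n points uniformly from the upper half-disk of radius R
    by sampling the bounding box [-R, R] x [0, R] and discarding misses."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty((0, 2))
    while len(out) < n:
        # Oversample by ~4/pi (inverse acceptance rate) plus a safety margin,
        # so one pass usually suffices.
        m = int((n - len(out)) * 4 / np.pi * 1.1) + 16
        x = rng.uniform(-R, R, size=m)
        y = rng.uniform(0, R, size=m)
        keep = x**2 + y**2 <= R**2
        out = np.vstack([out, np.column_stack([x[keep], y[keep]])])
    return out[:n]

pts = rejection_half_disk(10_000)
```

The loop body almost always runs once; the discarded surplus is the price paid for exact output size without per-sample Python loops.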