I am having trouble generating this data set for my dissertation from the following distribution.
My attempt results in this data set, in which x and y look closer to independent. I cannot spot where I am going wrong. Could somebody help me out?
Here is the code:
# Non-linear dependence without correlation
import numpy as np
import matplotlib.pyplot as plt
x = np.random.uniform(-0.5, 0.5, 500)
def y_samples(x):
    y = []
    for i in x:
        if np.abs(i) <= 1/6:
            y.append(np.random.normal(0, 1/9))
        else:
            y.append(0.5 * np.random.normal(1, 1/9) + 0.5 * np.random.normal(-1, 1/9))
    return y
y = y_samples(x)
plt.scatter(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.show()
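For comparison, here is a minimal sketch of one possible fix. It assumes the intended distribution for |x| > 1/6 is an equal-weight mixture of two normals centred at +1 and -1. The code above instead averages two independent draws, which gives a single normal centred at 0 with standard deviation (1/9)/sqrt(2), so every y ends up near 0 regardless of x. Sampling from a mixture means picking one component at random for each point and drawing from it. The function name `sample_y` and the `scale` parameter are hypothetical. Note also that the second argument of `np.random.normal` is the standard deviation, so if 1/9 is meant as a variance, the scale would need to be 1/3.

```python
import numpy as np

def sample_y(x, scale=1/9, rng=None):
    """Draw one y per x (assumed target: 50/50 mixture at +1/-1 when |x| > 1/6)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    inner = np.abs(x) <= 1/6
    # Choose ONE mixture component per point (+1 or -1, equal probability)
    # instead of averaging a draw from each component.
    signs = rng.choice([-1.0, 1.0], size=x.shape)
    # Mean is 0 inside the band, otherwise the chosen component's mean.
    means = np.where(inner, 0.0, signs)
    # scale is a standard deviation, matching NumPy's convention.
    return rng.normal(loc=means, scale=scale)

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
x = rng.uniform(-0.5, 0.5, 500)
y = sample_y(x, rng=rng)
```

The resulting `x` and `y` can be passed to the same `plt.scatter(x, y)` call as above.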
Thanks!