I'm unable to plot my classification data, which I load with NumPy. The plotting call throws this error: `ValueError: x and y must be the same size`

The `data` variable looks like this:

[[ 34.62365962  78.02469282   0.        ]
 [ 30.28671077  43.89499752   0.        ]
 [ 35.84740877  72.90219803   0.        ]
 [ 60.18259939  86.3085521    1.        ]
 [ 79.03273605  75.34437644   1.        ]
 [ 45.08327748  56.31637178   0.        ]
 [ 61.10666454  96.51142588   1.        ]
 [ 75.02474557  46.55401354   1.        ]]

Code:

import numpy as np
import pylab

data = np.loadtxt('ex2data1.txt', delimiter=',', dtype=None)
X = data[:, [0, 1]]
y = data[:, 2]
pylab.scatter(X, y)  # raises the ValueError quoted above
pylab.show()

I'm trying to plot this:

(image not shown)

Jaskaran Singh Puri
  • Whenever you plot a point, you have to give it the `x` and `y` coordinate for that point. Currently you're trying to plot two `x` values per `y` value, but it doesn't know how to map them. With your current code, the easiest thing would be to duplicate the `y` values for the second column of `x` values and plot all of them that way. – alkasm Jun 17 '17 at 10:53
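The size mismatch described in the comment above can be checked directly. The following is a minimal sketch that rebuilds the eight sample rows from the question as an in-memory array (an assumption standing in for the full `ex2data1.txt` file) and prints the shapes involved.

```python
import numpy as np

# The eight sample rows shown in the question: two features and a 0/1 label.
data = np.array([
    [34.62365962, 78.02469282, 0.0],
    [30.28671077, 43.89499752, 0.0],
    [35.84740877, 72.90219803, 0.0],
    [60.18259939, 86.3085521, 1.0],
    [79.03273605, 75.34437644, 1.0],
    [45.08327748, 56.31637178, 0.0],
    [61.10666454, 96.51142588, 1.0],
    [75.02474557, 46.55401354, 1.0],
])

X = data[:, [0, 1]]  # both feature columns together
y = data[:, 2]       # the label column

print(X.shape, X.size)  # (8, 2) 16
print(y.shape, y.size)  # (8,) 8

# scatter() pairs each x value with one y value, so the two inputs must
# have the same size. Passing the feature columns separately satisfies that.
x1, x2 = data[:, 0], data[:, 1]
print(x1.size == x2.size)  # True
```

Here `X` holds 16 values and `y` only 8, which is why `scatter(X, y)` is rejected. The label column is better suited to colouring the points than to serving as a coordinate.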

2 Answers

The easiest way is to unpack the columns while loading:

import numpy as np
import matplotlib.pyplot as plt

x,y,c = np.loadtxt('ex2data1.txt',delimiter=',', unpack=True)
plt.scatter(x,y,c=c)
plt.show()

You can also do the unpacking afterwards:

import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt('ex2data1.txt',delimiter=',')
plt.scatter(data[:,0],data[:,1],c=data[:,2])
plt.show()
ImportanceOfBeingErnest
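# Visualising the training set results.
# Assumes `cla` is an already-fitted classifier and that `X_train`, `y_train`
# are the two-feature training arrays it was fitted on.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

X_set, y_set = X_train, y_train
cmap = ListedColormap(('red', 'green'))

# Create the figure before drawing on it (plt.fig does not exist)
plt.figure(figsize=(12, 6))

# Grid covering the range of both features, padded by 1 on each side
X1, X2 = np.meshgrid(np.arange(start=X_set[:, 0].min() - 1, stop=X_set[:, 0].max() + 1, step=0.01),
                     np.arange(start=X_set[:, 1].min() - 1, stop=X_set[:, 1].max() + 1, step=0.01))
# Predict the class at every grid point and shade the decision regions
plt.contourf(X1, X2, cla.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha=0.75, cmap=cmap)
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
# Overlay the training points, one scatter call and one colour per class
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                color=cmap(i), label=j)
plt.title('KNN (Training set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()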

See the example
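The answer's code relies on `cla`, `X_train` and `y_train` being defined elsewhere. To make the idea runnable on its own, here is a rough sketch under stated assumptions: `NearestNeighbour` is a hypothetical one-nearest-neighbour stand-in for `cla` written in plain NumPy, the training points are the question's eight rows rounded to two decimals, and the grid step is coarsened to 1.0 to keep it cheap.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import ListedColormap


class NearestNeighbour:
    """Hypothetical 1-nearest-neighbour stand-in for the answer's `cla`."""

    def fit(self, X, y):
        self.X_ = np.asarray(X, dtype=float)
        self.y_ = np.asarray(y)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Squared distance from every query point to every training point
        d2 = ((X[:, None, :] - self.X_[None, :, :]) ** 2).sum(axis=2)
        # Each query point takes the label of its closest training point
        return self.y_[d2.argmin(axis=1)]


# The question's eight rows, rounded to two decimals and grouped by label
X_train = np.array([[34.62, 78.02], [30.29, 43.89], [35.85, 72.90], [45.08, 56.32],
                    [60.18, 86.31], [79.03, 75.34], [61.11, 96.51], [75.02, 46.55]])
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])
cla = NearestNeighbour().fit(X_train, y_train)

cmap = ListedColormap(("red", "green"))

# Coarse grid over the feature range, padded by 1 on each side
X1, X2 = np.meshgrid(
    np.arange(X_train[:, 0].min() - 1, X_train[:, 0].max() + 1, 1.0),
    np.arange(X_train[:, 1].min() - 1, X_train[:, 1].max() + 1, 1.0),
)
grid = np.column_stack([X1.ravel(), X2.ravel()])
Z = cla.predict(grid).reshape(X1.shape)  # predicted class at each grid point

fig, ax = plt.subplots(figsize=(12, 6))
ax.contourf(X1, X2, Z, alpha=0.75, cmap=cmap)  # shaded decision regions
for i, j in enumerate(np.unique(y_train)):
    ax.scatter(X_train[y_train == j, 0], X_train[y_train == j, 1],
               color=cmap(i), edgecolor="black", label=f"class {j}")
ax.legend()
```

Any fitted classifier with a `predict` method that accepts an `(n, 2)` array could replace the stand-in. The grid step trades smoothness of the region boundary against the number of predictions.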

Ravi kumar
  • Hi and thanks for the answer. It's great that it works for you, but it would help us if you could explain what you did and how you solved the initial problem! – Simas Joneliunas Feb 10 '22 at 05:48