I have been trying to debug my code for a while, and I need help plotting a scatter plot. When I tried to plot it, I got this error:

ValueError: 'c' argument has 2 elements, which is not acceptable for use with 'x' with size 48, 'y' with size 48.

The dataset: https://data.gov.sg/dataset/monthly-revalidation-of-coe-of-existing-vehicles?view_id=b228d20d-5771-48ec-9d7b-bb52351c0f7d&resource_id=e62a59fd-ee9f-43ec-ac69-58dc5c8045be

My code:

import numpy as np        #importing numpy as np declaring as np
import matplotlib.pyplot as plt   #importing matplotlib pyplot as plt

title = "COE revalidation"     #title of the output
titlelen = len(title)   
print("{:*^{titlelen}}".format(title, titlelen=titlelen+6))
print()

recoe = np.genfromtxt("data/annual-revalidation-of-certificate-of-entitlement-coe-of-existing-vehicles.csv",  #loading dataset, storing it as recoe
                      dtype=(int,"U12","U18",int),
                      delimiter=",",
                      names=True)
years = np.unique(recoe["year"])     #extracting unique values from year column, storing it as years
type = np.unique(recoe["type"])      #extracting unique values from type column, storing it as type
category = np.unique(recoe["category"])  #extracting unique values from category column, storing it as category
category5 = recoe[recoe["type"]=="5 Year"]   #extracting coe 5 year, storing it as category5
category10 = recoe[recoe["type"]=="10 Year"]  #extracting coe 10 year, storing it as category10

category5numbers = category5["number"]   #extracting 'number' from category5 and storing it as category5numbers   (number of revalidation , 5 years)
category10numbers = category10["number"]    #extracting 'number' from category10 and storing it as category10numbers   (number of revalidation , 10 years)
colours =['tab:blue', 'tab:orange'] 

plt.figure(figsize=(7, 6))
plt.scatter(category5numbers,category10numbers,c= colours ,linewidth=1,alpha=0.75,edgecolor='black',s=200)
plt.title("Scatter Plot of category5 versus category10")
plt.xlabel("number of category 5 revalidation")
plt.ylabel("number of category 10 revalidation")
plt.tight_layout()

plt.show()
onofricamila
coder121

2 Answers

0

If I understand correctly, you are trying to make a scatter plot of two variables and color it with two colors, but nothing in the plot call says which points belong to which color. You can't do this. You can, however, pass a list of colors if you can divide your data into multiple factors. The category column in your data set can be used to separate your data.
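As a rough sketch of that idea, the points can be split by category and drawn with one scatter call per group, so each call receives a single color. The data below is synthetic and only stands in for the two 48-element "number" columns. The category labels and the four colors are assumptions, not values taken from the real CSV, and the sketch assumes the "5 Year" and "10 Year" rows line up by category.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for the real columns (assumed labels, random numbers).
rng = np.random.default_rng(0)
categories = np.repeat(["Category A", "Category B", "Category C", "Category D"], 12)
x = rng.integers(0, 500, size=categories.size)  # plays the role of category5numbers
y = rng.integers(0, 500, size=categories.size)  # plays the role of category10numbers

palette = ["tab:blue", "tab:orange", "tab:green", "tab:red"]

fig, ax = plt.subplots(figsize=(7, 6))
# One scatter call per category: each call gets exactly one color.
for name, colour in zip(np.unique(categories), palette):
    mask = categories == name
    ax.scatter(x[mask], y[mask], c=colour, label=name,
               edgecolor="black", s=200, alpha=0.75)
ax.legend()
```

Calling plt.show() afterwards displays the figure, with one legend entry per category.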

0

Each point is drawn by pairing the elements at the same index of the 'x' and 'y' coordinate lists.

Now, let's figure out the problem. You get:

ValueError: 'c' argument has 2 elements, which is not acceptable for use with 'x' with size 48, 'y' with size 48.

That means the 'c' parameter of the scatter function has to have the same size as the 'x' and 'y' coordinate lists (category5numbers and category10numbers in your case). You can't pass a list with only 2 elements. Leaving aside the option of giving every point the same color, which you do by setting c to a single color format string, the 'c' parameter works as follows:

  • every point is mapped to the color at the same index, so the 'x', 'y' and 'c' lists line up element by element. There has to be a color specification for each point ...

That said, if you give only 2 colors for 48 points, the scatter function does not know what to do!
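A minimal sketch of that rule, using made-up data of the same size (48 points) and not your dataset: a two-element color list is rejected, while a single color string or a list with one color per point is accepted.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up coordinates, 48 points like in the question.
x = np.arange(48)
y = np.arange(48)

fig, ax = plt.subplots()

# 2 colors for 48 points: scatter raises ValueError.
try:
    ax.scatter(x, y, c=["tab:blue", "tab:orange"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# A single color string applies to every point.
single = ax.scatter(x, y, c="tab:blue")

# A list with exactly one color per point (2 colors repeated 24 times = 48).
per_point = ax.scatter(x, y, c=["tab:blue", "tab:orange"] * 24)
```

The exact wording of the error message can differ between matplotlib versions, but the exception type is ValueError in both cases.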

From the scatter docs, you get that 'c' can be ...

[image: screenshot of the 'c' parameter description from the scatter documentation]

So, summing up, you will have to:

  1. create a list to represent the colors
  2. fill all the 48 positions with the color representative you want
  3. Pass it to the scatter function
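The three steps above could look roughly like this. The coloring rule used here (blue when the x value is larger than the y value, orange otherwise) is only a hypothetical example, and the data is random, so replace both with whatever makes sense for your points.

```python
import numpy as np
import matplotlib.pyplot as plt

# Random stand-ins for category5numbers and category10numbers (48 values each).
rng = np.random.default_rng(1)
x = rng.integers(0, 500, size=48)
y = rng.integers(0, 500, size=48)

# Steps 1 and 2: build a list with one color per point.
# Hypothetical rule: blue if x > y, orange otherwise.
colours = ["tab:blue" if a > b else "tab:orange" for a, b in zip(x, y)]

# Step 3: pass the 48-element list to scatter.
fig, ax = plt.subplots(figsize=(7, 6))
points = ax.scatter(x, y, c=colours, linewidth=1, alpha=0.75,
                    edgecolor="black", s=200)
```

Because len(colours) equals len(x) and len(y), scatter can pair each point with its own color.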

Check out the first part of this answer to see how you can determine the colors, and to understand what I mean by "same size for the 'x' and 'y' coordinate lists and 'c'".

Community
onofricamila