import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

xs=np.array([2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25])
ys=np.array([10,12,20,22,21,25,30,21,32,34,35,30,50,45,55,60,66,64,67,72,74,80,79,84])

plt.figure(figsize=(8,6))
sns.scatterplot(x=xs,y=ys,marker='o',s=ys*25,color='g',alpha=0.5)
plt.title('scatter plot')
plt.xlabel('xs value')
plt.ylabel('ys value')
plt.show()

I want to draw a bubble plot. I have created one before with this same code, but I recently reinstalled Anaconda, and now the code keeps raising `ValueError: s must be a scalar, or the same size as x and y`.

ashish pondit
  • `s=ys*25` - do you want to do scalar times array multiplication? or are you doing "make a new array that is ys repeated 25 times"? Generally in Python list*scalar is the latter, but dunno about numpy - maybe it was former (to be more maths form) but was changed to latter (to adhere to Python behaviour)? – h4z3 Jun 01 '20 at 11:26
  • Yup, numpy is/was supposed to do it maths style, but there's also function to do it - https://stackoverflow.com/questions/53485221/numpy-multiply-array-with-scalar second answer – h4z3 Jun 01 '20 at 11:28
  • the function `scatterplot` does this at some point `self._process_unit_info(xdata=x, ydata=y, kwargs=kwargs)` and `x = self.convert_xunits(x)` which sets x to be `[ ]` – snatchysquid Jun 01 '20 at 11:33
  • If you are making 3D plots before this, you can also run into this issue. If you did not close your 3D plot or create new 2D axes before calling `plt.scatter` for 2D data, matplotlib raises this error. That's what got me here. – SpaceMonkey55 May 16 '23 at 11:10

2 Answers


Code works fine for me.

  • python 3.7.6
  • pandas 1.0.3

In short, the s argument of a scatter plot must be either (a) a scalar (a single number), or (b) a sequence whose length matches the length of x and y, so that each plotted point can be assigned its own size.

The text "must be a scalar, or the same size" is intended to suggest that, but may seem cryptic.
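As a rough illustration of the two accepted forms, the sketch below tries three values of s against 24 points. The helper name `scatter_accepts` and the `ys` values are made up for this example, and it assumes a Matplotlib version that performs this length check (the version in the question evidently does). It also touches on the question raised in the comments: multiplying a NumPy array by a scalar is elementwise and keeps the length, while multiplying a Python list by an integer repeats the list.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

xs = np.arange(2, 26)          # 24 x values, same count as in the question
ys = np.linspace(10, 84, 24)   # stand-in y values, not the asker's data


def scatter_accepts(s):
    """Return True if scatter accepts `s` for xs/ys, False if it raises ValueError."""
    fig, ax = plt.subplots()
    try:
        ax.scatter(xs, ys, s=s)
        return True
    except ValueError:
        return False
    finally:
        plt.close(fig)


scalar_ok = scatter_accepts(50)           # one size shared by every point
array_ok = scatter_accepts(ys * 25)       # elementwise product, still 24 values
list_ok = scatter_accepts(list(ys) * 25)  # list repetition, 600 values

print(scalar_ok, array_ok, list_ok)
```

If the third case is rejected on your installation as well, the thing to check is the length of whatever you pass as s, not the scatter call itself.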

Since others may easily encounter this error, I wanted to document the common cause and its fix here.

I encountered this error doing something like this:

df[some_filter].plot.scatter(x='x', y='y', s=df['s'])

The reason is that x and y are taken from a filtered subset of df (the filter on the left), but the size parameter s is taken from the unfiltered DataFrame. The length of s therefore does not match the length of x and y, and scatter cannot assign a size to each point.
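A quick way to confirm this kind of mismatch is to compare lengths before plotting. The sketch below uses a small made-up DataFrame; the column names follow the snippet above, and the data and the filter condition are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data; only the column names come from the answer's snippet.
df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "y": [4, 3, 2, 1],
    "s": [10, 20, 30, 40],
})
some_filter = df["x"] > 2   # boolean mask that keeps 2 of the 4 rows

filtered = df[some_filter]
n_points = len(filtered)               # rows that would be plotted
n_sizes_unfiltered = len(df["s"])      # sizes taken from the full frame
n_sizes_filtered = len(filtered["s"])  # sizes taken from the same subset

print(n_points, n_sizes_unfiltered, n_sizes_filtered)
```

When the first two numbers differ, the s values do not come from the same rows as x and y.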

If you encounter this situation, it may be easiest to assign the filtered DataFrame to a new variable, and then use that variable for both the scatter() call and the s parameter.

Example:

temp = df[some_filter]
temp.plot.scatter(x='x', y='y', s=temp['s'])
Mark Andersen

I had this issue when I was trying to use a filtered column of a dataframe. So, I solved it with something like this:

import numpy as np
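test_b = SOLEUR.loc[SOLEUR['type'] == 'buy', 'cost']  # select rows and column in a single .loc call
y_b = test_b.astype(np.float64)  # np.float was removed in NumPy 1.24; use float or np.float64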
plt.scatter(....s=y_b)  
ks.bi