
I'm making a scatter plot of thousands of points in Python using pyplot. My problem is that the points tend to concentrate in one place, so the plot is just a giant blob.

Is there a way to make pyplot draw individual points until they reach some critical density, and switch to a contour plot beyond that?

My question is similar to this one, where the example plot has contour lines, in which the color represents the density of the plotted points.

Super cool contour plot

This is what my data looks like:

Lots of yellow points

Matthew
  • You might not have to make a switch. If the points are loose, then the contour lines will not be too visible, but the points themselves will convey the information. However, if the points are dense, as in the image above, then they will create a nice background over which the `contour` should be visible. So I suggest first using a `scatter` with filled markers, and a `contour` on top of that. You just have to define a density which you can `contour` plot. And if this doesn't work for you, *then* try doing a switch, probably to a `contourf`. – Andras Deak -- Слава Україні Nov 19 '15 at 01:36
  • Why not just reduce the size of your dots? Or use some transparency, which will effectively give you the density as a gray scale? – Julien Nov 19 '15 at 02:20

1 Answer


First, you need a density estimate of your data. Depending on the method you choose, the results can vary.

Let's assume you want Gaussian kernel density estimation. Based on the example in the `scipy.stats.gaussian_kde` documentation, you can get the density height with:

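import numpy as np
from scipy import stats

def density_estimation(m1, m2):
    # xmin, xmax, ymin, ymax are the data bounds, defined at module level
    # as in the scipy example (e.g. xmin, xmax = m1.min(), m1.max()).
    X, Y = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]  # 100x100 evaluation grid
    positions = np.vstack([X.ravel(), Y.ravel()])
    values = np.vstack([m1, m2])
    kernel = stats.gaussian_kde(values)  # fit the KDE to the data
    Z = np.reshape(kernel(positions).T, X.shape)  # density on the grid
    return X, Y, Z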

Then you can plot it with `contour`:

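import matplotlib.pyplot as plt

# Plot bounds, taken from the data (m1 and m2 are NumPy arrays)
xmin, xmax = m1.min(), m1.max()
ymin, ymax = m2.min(), m2.max()

X, Y, Z = density_estimation(m1, m2)

fig, ax = plt.subplots()

# Show density
ax.imshow(np.rot90(Z), cmap=plt.cm.gist_earth_r,
          extent=[xmin, xmax, ymin, ymax])

# Add contour lines
ax.contour(X, Y, Z)

# Overlay the individual points
ax.plot(m1, m2, 'k.', markersize=2)

ax.set_xlim([xmin, xmax])
ax.set_ylim([ymin, ymax])
plt.show()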
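
If you really want the switch described in the question (individual points where the data is sparse, contours where it is dense), one possible approach is to evaluate the KDE at the data points themselves and only draw the markers whose density falls below a threshold. The following is a rough sketch, not a built-in pyplot feature: the function names and the `quantile` parameter are made up for illustration, and the threshold is simply assumed to be a quantile of the per-point densities.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats


def split_by_density(m1, m2, quantile=0.2):
    """Fit a Gaussian KDE and flag the points lying in sparse regions.

    `quantile` is a hypothetical tuning knob: the fraction of points
    (the lowest-density ones) that will be drawn as individual markers.
    """
    values = np.vstack([m1, m2])
    kernel = stats.gaussian_kde(values)
    point_density = kernel(values)  # density at each data point
    threshold = np.quantile(point_density, quantile)
    sparse = point_density <= threshold
    return kernel, threshold, sparse


def plot_points_then_contours(m1, m2, quantile=0.2, grid=100):
    m1, m2 = np.asarray(m1), np.asarray(m2)
    kernel, threshold, sparse = split_by_density(m1, m2, quantile)

    # Evaluate the density on a regular grid for the contour plot.
    X, Y = np.mgrid[m1.min():m1.max():grid * 1j,
                    m2.min():m2.max():grid * 1j]
    Z = kernel(np.vstack([X.ravel(), Y.ravel()])).reshape(X.shape)

    fig, ax = plt.subplots()
    # Filled contours only at or above the threshold density;
    # regions below the lowest level are left unfilled.
    levels = np.linspace(threshold, Z.max(), 8)
    ax.contourf(X, Y, Z, levels=levels, cmap="viridis")
    # Individual markers only for the points in sparse regions.
    ax.plot(m1[sparse], m2[sparse], "k.", markersize=2)
    return fig, ax
```

Call `plot_points_then_contours(m1, m2)` followed by `plt.show()`. Raising `quantile` draws more individual points and shrinks the contoured region.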

As an alternative, you could color the markers by their density, as shown here.
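
A minimal sketch of that marker-coloring idea, assuming the density is again estimated with `scipy.stats.gaussian_kde` and evaluated at the data points themselves. The function names are illustrative only, and the linked answer may do it differently.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats


def point_densities(m1, m2):
    """Evaluate a Gaussian KDE at the data points themselves."""
    values = np.vstack([m1, m2])
    return stats.gaussian_kde(values)(values)


def scatter_colored_by_density(m1, m2):
    m1, m2 = np.asarray(m1), np.asarray(m2)
    density = point_densities(m1, m2)
    # Draw the densest points last so sparse ones do not hide them.
    order = np.argsort(density)
    fig, ax = plt.subplots()
    sc = ax.scatter(m1[order], m2[order], c=density[order],
                    s=4, cmap="viridis")
    fig.colorbar(sc, ax=ax, label="estimated density")
    return fig, ax
```

Sorting by density before plotting is a design choice: without it, low-density markers drawn later can cover the high-density core.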

memoselyk
  • Hey, thanks for the answer! It looks like it's getting stuck on the 'Z = np.reshape(kernel(positions).T, X.shape)' line though. Is there some other way to do something equivalent? – Matthew Dec 23 '15 at 22:23
  • This throws the error message: `NameError: global name 'xmin' is not defined`. I *assume* it's supposed to be `min(m1)`. – AnnanFay Jan 31 '17 at 19:40
  • Yes, they are not defined in this _snippet_. They come from the reference of the `gaussian_kde` function. – memoselyk Feb 02 '17 at 16:15
  • If you swap all x's and y's inside the function, you won't need to rotate to do the plotting. To me that seems a little simpler. – zephyr Jun 04 '18 at 18:10
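
On the `kernel(positions)` line appearing to hang: the cost of evaluating a Gaussian KDE grows with the number of data points times the number of grid points, so it can be slow for large inputs. One cheaper stand-in (a different technique, not equivalent to the KDE) is a normalized 2-D histogram, sketched below. The function names and the `bins` value are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt


def histogram_density(m1, m2, bins=50):
    """Approximate the point density with a normalized 2-D histogram."""
    H, xedges, yedges = np.histogram2d(m1, m2, bins=bins, density=True)
    # Bin centers, so the contour coordinates match the histogram cells.
    xc = 0.5 * (xedges[:-1] + xedges[1:])
    yc = 0.5 * (yedges[:-1] + yedges[1:])
    return xc, yc, H


def plot_histogram_contours(m1, m2, bins=50):
    xc, yc, H = histogram_density(m1, m2, bins)
    fig, ax = plt.subplots()
    ax.plot(m1, m2, "k.", markersize=2)
    # histogram2d indexes H as [x, y]; contour expects [y, x], hence H.T.
    ax.contour(xc, yc, H.T)
    return fig, ax
```

The contours will look blockier than the KDE version. Fewer bins give smoother but coarser lines.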