9

I'm trying to plot many thousands of circle objects, and I don't have much experience working with Python. I'd like to specify the position, radius and color of each circle. Is there a more efficient way to achieve the same result?

import matplotlib.pyplot as plt

xvals = [0,.1,.2,.3]
yvals = [0,.1,.2,.3]
rvals = [0,.1,.1,.1]

c1vals = [0,.1,0,.1]
c2vals = [.1,0,.1,0]
c3vals = [.1,.1,.1,.1]

for q in range(0,4):
    circle1=plt.Circle((xvals[q], yvals[q]), rvals[q], color=[0,0,0])
    plt.gcf().gca().add_artist(circle1)
anon01
  • An `EllipseCollection` might do what you want: http://matplotlib.org/examples/pylab_examples/ellipse_collection.html – tacaswell Sep 07 '15 at 22:00

4 Answers

16

The key here is to use a Collection. In your case, you want to make a PatchCollection.

Matplotlib optimizes the drawing of many similar artists by using collections, which is considerably faster than drawing each artist individually. Furthermore, the plot won't contain thousands of individual artists, only one collection. This speeds up many other operations that have to touch every artist each time the plot is drawn.

scatter is actually much faster than your current approach, because it adds one collection instead of separate artists. However, it draws markers whose size is given in points², not in data coordinates.

To get around that, you can use the same approach scatter does, but create the collection manually.

As an example:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.collections

num = 5000
sizes = 0.2 * np.random.random(num)
xy = 50 * np.random.random((num, 2))

# Note that the patches won't be added to the axes, instead a collection will
patches = [plt.Circle(center, size) for center, size in zip(xy, sizes)]

fig, ax = plt.subplots()

coll = matplotlib.collections.PatchCollection(patches, facecolors='black')
ax.add_collection(coll)

ax.margins(0.01)
plt.show()

[Figure: the 5000 circles drawn as a single PatchCollection]

This renders quite smoothly for me. Just to prove that the circles are in data coordinates, note what happens if we zoom in on a narrow rectangle (note: this assumes that the aspect of the plot is set to auto):

[Figure: zoomed view of a narrow rectangle, where the circles appear stretched]
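
The question also asks about a color per circle, which the example above does not show. Here is a minimal sketch, assuming the `c1vals`, `c2vals` and `c3vals` lists from the question are meant as the red, green and blue components of each circle (that reading is my assumption, the question never says so). `facecolors` accepts one color per patch:

```python
import matplotlib.pyplot as plt
from matplotlib.collections import PatchCollection

# Data from the question (with the typo in c1vals fixed)
xvals = [0, .1, .2, .3]
yvals = [0, .1, .2, .3]
rvals = [0, .1, .1, .1]
c1vals = [0, .1, 0, .1]
c2vals = [.1, 0, .1, 0]
c3vals = [.1, .1, .1, .1]

# One Circle patch per point; the patches are never added to the axes directly
patches = [plt.Circle((x, y), r) for x, y, r in zip(xvals, yvals, rvals)]

# Assumption: the three lists are the (r, g, b) components of each circle
colors = list(zip(c1vals, c2vals, c3vals))

fig, ax = plt.subplots()
coll = PatchCollection(patches, facecolors=colors, edgecolors='none')
ax.add_collection(coll)
ax.autoscale_view()
# plt.show()  # uncomment to view interactively
```

The first circle has radius 0, so it will not be visible, and with components this small all of the colors come out very dark.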


If you're really focused on speed, you can use an EllipseCollection as @tcaswell suggested.

An EllipseCollection will only make one path, but will scale and translate it at draw time to be in the places/sizes you specify.

The downside is that while the size can be in data coordinates, the circle will always be a circle, even if the aspect ratio of the plot isn't 1. (i.e. the circles won't stretch as they do in the figure above).

The advantage is that it's fast.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.collections

num = 5000
sizes = 0.4 * np.random.random(num)
xy = 50 * np.random.random((num, 2))

fig, ax = plt.subplots()

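# widths and heights are full diameters, measured in x-data units (units='x')
coll = matplotlib.collections.EllipseCollection(sizes, sizes,
                                                np.zeros_like(sizes),
                                                offsets=xy, units='x',
                                                # named transOffset before Matplotlib 3.6
                                                offset_transform=ax.transData)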
ax.add_collection(coll)
ax.margins(0.01)
plt.show()

[Figure: the same kind of plot drawn with an EllipseCollection]

Notice the difference as we zoom in on a similar region to the second figure. The circles get bigger (the size is in data coordinates), but remain circles instead of becoming elongated. They're not an accurate representation of a circle in "data" space.

[Figure: zoomed view, where the circles are larger but still round]

To give some idea of the time difference, here's the time to create and draw a figure with the same 5000 circles with each of the three methods:

In [5]: %timeit time_plotting(circles)
1 loops, best of 3: 3.84 s per loop

In [6]: %timeit time_plotting(patch_collection)
1 loops, best of 3: 1.37 s per loop

In [7]: %timeit time_plotting(ellipse_collection)
1 loops, best of 3: 228 ms per loop
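
The `time_plotting` helper and the three functions passed to it are not shown in the answer. The following is only a guess at what such a harness could look like, not the code that produced the numbers above. Every name in it is hypothetical, it renders off-screen with the Agg backend, and `num` is kept small so it finishes quickly (raise it to 5000 to mirror the timings):

```python
import time

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen so no window opens while timing
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import EllipseCollection, PatchCollection

rng = np.random.default_rng(0)
num = 1000  # the answer used 5000
radii = 0.2 * rng.random(num)
xy = 50 * rng.random((num, 2))


def circles(ax):
    """One Circle artist per point (the approach from the question)."""
    for center, r in zip(xy, radii):
        ax.add_artist(plt.Circle(center, r, color='black'))


def patch_collection(ax):
    """All circles bundled into a single PatchCollection."""
    patches = [plt.Circle(center, r) for center, r in zip(xy, radii)]
    ax.add_collection(PatchCollection(patches, facecolors='black'))


def ellipse_collection(ax):
    """One shared path that is scaled and offset at draw time."""
    coll = EllipseCollection(2 * radii, 2 * radii, np.zeros(num),
                             offsets=xy, units='x', facecolors='black')
    coll.set_offset_transform(ax.transData)  # offsets are in data coordinates
    ax.add_collection(coll)


def time_plotting(add_circles):
    """Return the seconds needed to build and fully render one figure."""
    start = time.perf_counter()
    fig, ax = plt.subplots()
    add_circles(ax)
    ax.set(xlim=(0, 50), ylim=(0, 50))
    fig.canvas.draw()  # force a complete render
    elapsed = time.perf_counter() - start
    plt.close(fig)
    return elapsed


for method in (circles, patch_collection, ellipse_collection):
    print(f"{method.__name__}: {time_plotting(method):.2f} s")
```

Absolute numbers will depend on the machine, the backend and the Matplotlib version, so only the ratios between the three methods are worth comparing.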
Joe Kington
  • This question also prompted https://github.com/matplotlib/matplotlib/pull/5035 as it _seems_ like `CircleCollection` should do what you want, but it is hard-coded to be area in points^2. – tacaswell Sep 07 '15 at 23:17
  • This is a great answer. I was going to edit with time trials, but you beat me to it :). FWIW, I was comparing circles vs patch_collection and got similar ratio of numbers. Thanks! – anon01 Sep 08 '15 at 00:05
  • @tcaswell - Nice! Thanks for that! – Joe Kington Sep 08 '15 at 15:36
  • Can someone explain how to make unfilled circles using the PatchCollection above? – Simrandeep Bahal Feb 02 '21 at 18:02
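
On the unfilled-circles question in the last comment: one way that should work is to pass the string `'none'` as the face color and set an edge color instead. A minimal sketch, where the random data and the line width are arbitrary choices of mine:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import PatchCollection

rng = np.random.default_rng(0)
num = 500
sizes = 0.2 * rng.random(num)
xy = 50 * rng.random((num, 2))

patches = [plt.Circle(center, size) for center, size in zip(xy, sizes)]

fig, ax = plt.subplots()
# facecolors='none' (the string, not None) leaves the interiors empty,
# so only the outlines given by edgecolors are drawn
coll = PatchCollection(patches, facecolors='none', edgecolors='black',
                       linewidths=0.5)
ax.add_collection(coll)
ax.autoscale_view()
# plt.show()  # uncomment to view interactively
```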
3

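scatter is probably a better fit for you than plt.Circle, though used in a loop like this it won't make anything run faster.

# plt is matplotlib.pyplot, as in the question (originally written as mp here)
for i in range(4):
    # s is the marker area in points², not a radius in data units
    plt.scatter(xvals[i], yvals[i], s=rvals[i])

If you can live with all the circles being the same size, then plt.plot(xvals[i], yvals[i], marker='o') will be more performant.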
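
For comparison, scatter also accepts whole sequences, so the loop can be collapsed into one call that creates a single collection. This is only a sketch: the `areas` and `colors` values are made up for illustration, and the sizes are marker areas in points², so they do not correspond to the data-unit radii from the question:

```python
import matplotlib.pyplot as plt

xvals = [0, .1, .2, .3]
yvals = [0, .1, .2, .3]
areas = [0, 100, 100, 100]  # hypothetical marker areas in points**2
colors = [(0, .1, .1), (.1, 0, .1), (0, .1, .1), (.1, 0, .1)]  # hypothetical RGB tuples

fig, ax = plt.subplots()
# One call, one collection: position, size and color are all per-point sequences
points = ax.scatter(xvals, yvals, s=areas, c=colors)
# plt.show()  # uncomment to view interactively
```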

But this is probably a matplotlib limitation, rather than a language limitation. There are excellent JavaScript libraries for plotting thousands of data points efficiently (d3.js). Maybe someone here will know of one that you can call from Python.

danodonovan
  • Unfortunately, scatter does not allow you to specify the radius in "data" units, but rather takes the marker area in points² as input. – anon01 Sep 08 '15 at 00:07
1

You would certainly want to move ...gca() outside of your loop. You can also use a list comprehension.

fig = plt.figure()
ax = fig.gca()  # look up the axes once, outside the loop

[ax.add_artist(plt.Circle((xvals[q], yvals[q]), rvals[q], color=[0, 0, 0]))
 for q in range(4)]  # xrange(4) on Python 2

Below are some tests to generate 4,000 circles using the various methods:

xvals = [0,.1,.2,.3] * 1000
yvals = [0,.1,.2,.3] * 1000
rvals = [0,.1,.1,.1] * 1000

%%timeit -n5 fig = plt.figure(); ax = plt.gcf().gca()
for q in range(4000):
    circle1=plt.Circle((xvals[q], yvals[q]), rvals[q], color=[0,0,0])
    plt.gcf().gca().add_artist(circle1)
5 loops, best of 3: 792 ms per loop

%%timeit -n5 fig = plt.figure(); ax = plt.gcf().gca()
for q in xrange(4000):
    ax.add_artist(plt.Circle((xvals[q],yvals[q]),rvals[q],color=[0,0,0]))
5 loops, best of 3: 779 ms per loop

%%timeit -n5 fig = plt.figure(); ax = plt.gcf().gca()
[ax.add_artist(plt.Circle((xvals[q],yvals[q]),rvals[q],color=[0,0,0])) for q in xrange(4000)]
5 loops, best of 3: 730 ms per loop
Alexander
  • this looks like the right direction for what I want... what exactly is list comprehension? – anon01 Sep 07 '15 at 18:22
  • List comprehensions are generally more efficient than `for` loops. https://docs.python.org/2/tutorial/datastructures.html#list-comprehensions – Alexander Sep 07 '15 at 18:25
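
To illustrate the list comprehension question from the comments, here is a small sketch that needs no plotting library. It builds the same list of circle centers three ways; the variable names are only for illustration:

```python
xvals = [0, .1, .2, .3]
yvals = [0, .1, .2, .3]

# Explicit for loop: start with an empty list and append to it
centers_loop = []
for q in range(4):
    centers_loop.append((xvals[q], yvals[q]))

# Equivalent list comprehension: the loop is written inside the brackets
centers_comp = [(xvals[q], yvals[q]) for q in range(4)]

# zip pairs the elements up directly, so no index is needed
centers_zip = list(zip(xvals, yvals))
```

All three produce the same list of (x, y) tuples. The comprehension is mainly a more compact way to build a list from another sequence.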
1

Not sure what you are really trying to do, or what your issues or concerns are, but here is a totally different method of plotting circles... make an SVG file like this and call it circles.svg

<?xml version="1.0" standalone="no"?>
<svg width="500" height="300" version="1.1" xmlns="http://www.w3.org/2000/svg">
  <circle cx="100" cy="175" r="200" stroke="lime"   fill="coral"  stroke-width="28"/>
  <circle cx="25"  cy="75"  r="80"  stroke="red"    fill="yellow" stroke-width="5"/>
  <circle cx="400" cy="280" r="20"  stroke="black"  fill="blue"   stroke-width="10"/>
</svg>

and pass it to ImageMagick to make into a PNG file like this:

convert circles.svg result.png

[Figure: result.png as rendered by ImageMagick]
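
If the circle data already lives in Python, the SVG text could be generated there instead of written by hand. A rough sketch using only the standard library; the `circles_to_svg` helper and the tuple layout are hypothetical, and the values are copied from the SVG above:

```python
from xml.sax.saxutils import quoteattr

# (cx, cy, r, stroke, fill, stroke_width), copied from the hand-written SVG
circles = [
    (100, 175, 200, "lime", "coral", 28),
    (25, 75, 80, "red", "yellow", 5),
    (400, 280, 20, "black", "blue", 10),
]


def circles_to_svg(circles, width=500, height=300):
    """Build an SVG document with one <circle> element per tuple."""
    lines = [
        '<?xml version="1.0" standalone="no"?>',
        f'<svg width="{width}" height="{height}" version="1.1" '
        'xmlns="http://www.w3.org/2000/svg">',
    ]
    for cx, cy, r, stroke, fill, stroke_width in circles:
        # quoteattr adds the surrounding quotes and escapes special characters
        lines.append(
            f'  <circle cx="{cx}" cy="{cy}" r="{r}" stroke={quoteattr(stroke)} '
            f'fill={quoteattr(fill)} stroke-width="{stroke_width}"/>'
        )
    lines.append('</svg>')
    return "\n".join(lines) + "\n"


svg_text = circles_to_svg(circles)
```

Writing `svg_text` to circles.svg would then give a file that can be passed to the convert command as before.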

Mark Setchell
  • thanks for the suggestion. This is part of a larger project where I'd like to manipulate the data, so I'd prefer to keep it within Python. – anon01 Sep 07 '15 at 21:56
  • To the downvoter... if you are going to downvote, you might at least have the courtesy to explain why so that we can all learn something. Within the parameters of the OP's question, this is a perfectly reasonable answer. – Mark Setchell Sep 07 '15 at 22:00