I am interested in constructing a sunflower scatter plot (as depicted in, for example, http://www.jstatsoft.org/v08/i03/paper [PDF link]). Before I write my own implementation, does anyone know of an existing one? I am aware of the functions in Stata and R, but am looking for one in matplotlib.

Thank you.

cytochrome
  • What does your data look like? Specifically, the sunflower plot isn't really a scatter plot since the data is positioned along a hex grid. Is yours positioned on a hexagonal grid, or do you want the sunflower shapes at non-grid positions? – tom10 Mar 04 '14 at 07:03
  • As in the example given in the paper referred to above, my data are 'scattered'. The data would, of course, have to be binned into the appropriate hexagonal grid cells. – cytochrome Mar 04 '14 at 07:13
  • Check out [`plt.hexbin`](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.hexbin) histograms, example here: http://stackoverflow.com/a/2371812/1643946. Doesn't have the markers over the top so needs some work – Bonlenfum Mar 04 '14 at 08:43
  • Thank you. That is a great start. – cytochrome Mar 05 '14 at 00:39

1 Answer

I don't know of any matplotlib implementations, but it's not hard to do. Here I let hexbin do the counting, then go through each cell and add the appropriate number of petals:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colors

np.random.seed(0)
n = 2000
x = np.random.standard_normal(n)
y = 2.0 + 3.0 * x + 4.0 * np.random.standard_normal(n)

cmap = colors.ListedColormap(['white', 'yellow', 'orange'])
hb = plt.hexbin(x,y, bins='log', cmap=cmap, gridsize=20, edgecolor='gray')
plt.axis([-2, 2, -12, 12])
plt.title("sunflower plot")

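# hexbin stores one value per cell; get_offsets() gives the cell centers
counts = hb.get_array()
coords = hb.get_offsets()

for (cx, cy), log_count in zip(coords, counts):
    # Undo the log scaling from bins='log'. This assumes get_array() returns
    # log10-scaled values, as it did when this was written; if your matplotlib
    # version returns raw counts, use the value directly instead.
    count = int(10**log_count)
    if 3 < count <= 12:
        petals = count        # yellow cells: 1 petal = 1 point
    elif count > 12:
        petals = count // 5   # orange cells: 1 petal = 5 points
    else:
        continue              # white cells get no marker
    if petals > 1:
        plt.plot([cx], [cy], 'k.')
        # marker=(n, 2) draws an asterisk with n arms
        plt.plot([cx], [cy], marker=(petals, 2), color='k', markersize=18)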

plt.show()

In yellow cells each petal represents 1 point; in orange cells each petal represents 5 points.
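The petal rule can also be pulled out into a small function, which makes the thresholds easier to adjust and check. This is only a sketch: the function name `sunflower_petals` and its parameter names are made up for illustration, and the defaults simply restate the hard-coded values from the code above (yellow for counts 4 to 12, orange from 13 up, 5 points per orange petal).

```python
def sunflower_petals(count, light_min=4, dark_min=13, dark_unit=5):
    """Return (color, petals) for a cell holding `count` points.

    Hypothetical helper: the defaults mirror the hard-coded borders
    (3 and 12) and the 1-petal-per-5-points rule used for orange cells.
    """
    if count >= dark_min:
        color = "orange"
        petals = count // dark_unit   # 1 petal = dark_unit points
    elif count >= light_min:
        color = "yellow"
        petals = count                # 1 petal = 1 point
    else:
        return ("white", 0)           # sparse cells get no sunflower
    # A marker is only worth drawing with more than one petal
    if petals <= 1:
        petals = 0
    return (color, petals)


for c in (2, 4, 12, 13, 27):
    print(c, sunflower_petals(c))
```

Inside the plotting loop, the returned petal count would go straight into the marker tuple, e.g. `marker=(petals, 2)`.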

One obvious place for improvement is the colormap handling. For example, do you want to preset the color boundaries or calculate them from the data? Here I just kludged it a bit: I used bins='log' only to get a reasonable ratio between yellow and orange cells for the particular sample I used, and I hard-coded the borders between white, yellow, and orange cells (3 and 12).
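One way to preset the color boundaries, sketched here on the assumption that the borders stay at 3 and 12, is to pass a `BoundaryNorm` to `hexbin` instead of `bins='log'`. The cell values then stay raw counts, and the color classes line up with the petal thresholds by construction. The variable name `edges` and the choice of `n + 1` as the top edge are illustrative, not part of the code above.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colors

np.random.seed(0)
n = 2000
x = np.random.standard_normal(n)
y = 2.0 + 3.0 * x + 4.0 * np.random.standard_normal(n)

cmap = colors.ListedColormap(['white', 'yellow', 'orange'])
# Illustrative preset edges: counts 0-3 -> white, 4-12 -> yellow, 13+ -> orange.
# The last edge only needs to exceed any count a single cell could hold.
edges = [0, 4, 13, n + 1]
norm = colors.BoundaryNorm(edges, cmap.N)

fig, ax = plt.subplots()
hb = ax.hexbin(x, y, gridsize=20, cmap=cmap, norm=norm, edgecolors='gray')
counts = hb.get_array()    # raw per-cell counts, since bins='log' is not used
coords = hb.get_offsets()  # cell centers, one row per entry in counts
```

With this setup the loop can compare each entry of `counts` against the same edges directly, with no log conversion, and `plt.show()` displays the result. Calculating the edges from the data instead (for example from quantiles of the nonzero counts) would be a drop-in change to the `edges` list.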

Being able to specify the marker as a tuple in matplotlib makes it really easy to draw all the different petal numbers: marker=(n, 2) draws an asterisk with n arms.

tom10
  • Excellent! With a few tweaks, this approach should work well for my application. – cytochrome Mar 06 '14 at 13:40
  • Great. I edited the last couple of paragraphs to make a few things more clear. (And if you publish, it would be interesting to see what you end up with.) – tom10 Mar 06 '14 at 18:15