Pandas: Stacked dots histogram

Question

Basically, the title. I want to draw a histogram where the bars are replaced by columns of stacked dots. There is an answer to this specific question in R, but I'd like to stay within Python.

Any help is much appreciated :)

Edit: Added link to image example of what the final result should look like

C. Helling · Answer 1 · 2018-05-04T15:48:07.663

1

Not sure exactly what you mean by "a histogram with dots," but what you described sounds reminiscent to me of seaborn's swarmplot:

import seaborn as sns
sns.swarmplot(x="day", y="total_bill", data=sns.load_dataset("tips"))

Swarmplot documentation here: https://seaborn.pydata.org/generated/seaborn.swarmplot.html

Upon seeing your edit, it seems like perhaps this is more of what you're looking for:

import matplotlib.pyplot as plt
import numpy as np
from collections import Counter

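data = np.random.randint(10, size=100)
remaining = Counter(data)  # occurrences of each value not yet plotted
heights = []

for value in data:
    # Place each dot at the current remaining count, then shrink the stack by one.
    heights.append(remaining[value])
    remaining[value] -= 1

plt.scatter(data, heights)
plt.show()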

I personally think the swarmplot looks a lot better, but whatever floats your boat.

edited May 04 '18 at 15:48

answered May 03 '18 at 14:48

C. Helling


I am afraid this is not what I meant. What I want is a histogram in which the bars are replaced by strips of dots, so that a bar of height 3 is literally shown as a stack of 3 dots. I have edited the question and added a picture. – jasikevicius23 May 04 '18 at 11:31
@jasikevicius23 As far as I know, this isn't a feature of [matplotlib.pyplot](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.hist.html) or [pandas](https://pandas.pydata.org/pandas-docs/stable/visualization.html). This could perhaps be done by taking data from a histogram and reshaping it as a scatterplot. Would that work? – C. Helling May 04 '18 at 14:56
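The reshaping idea from the comment above could look roughly like this for numeric data. This is only a sketch: the helper names dot_histogram_coords and plot_dot_histogram are made up for illustration, and it assumes numpy.histogram's default equal-width bins are acceptable.

```python
import numpy as np


def dot_histogram_coords(values, bins=10):
    """Return (x, y) scatter coordinates: one dot per observation, stacked per bin.

    Hypothetical helper for illustration, not part of numpy or matplotlib.
    """
    counts, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    # Repeat each bin center once per observation that fell into that bin.
    x = np.repeat(centers, counts)
    # Within each bin, number the dots 1..count so they stack upward.
    y = np.concatenate([np.arange(1, c + 1) for c in counts])
    return x, y


def plot_dot_histogram(values, bins=10):
    """Draw the stacked dots as a scatter plot and return the Axes."""
    from matplotlib import pyplot  # imported lazily; only needed for drawing

    x, y = dot_histogram_coords(values, bins=bins)
    fig, ax = pyplot.subplots()
    ax.scatter(x, y)
    ax.set(xlabel='Value', ylabel='Count')
    return ax
```

Calling plot_dot_histogram(np.random.normal(size=200), bins=15) and then pyplot.show() should draw one dot per observation, with each column as tall as the corresponding histogram bar.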

score 0 · Accepted Answer · answered May 04 '18 at 16:03

There's nothing out of the box that will do this in matplotlib or its derivatives (that I'm familiar with). Luckily, pandas.Series.value_counts() does a lot of the heavy lifting for us:

import numpy
from matplotlib import pyplot
import pandas

numpy.random.seed(0)
pets = ['cat', 'dog', 'bird', 'lizard', 'hamster']
hist = pandas.Series(numpy.random.choice(pets, size=25)).value_counts()
x = []
y = []
for p in pets:
    n = hist.get(p, 0)  # a pet that was never drawn has no entry in hist
    x.extend([p] * n)
    y.extend(numpy.arange(n) + 1)

fig, ax = pyplot.subplots(figsize=(6, 6))
ax.scatter(x, y)
ax.set(aspect='equal', xlabel='Pet', ylabel='Count')
pyplot.show()

And that gives me:

Yes! This is exactly what I wanted. Too bad that there isn't a simpler way. Thanks! – jasikevicius23 May 04 '18 at 16:31
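On the wish for something simpler: one shorter variant, offered as a sketch rather than a built-in feature, is to let pandas number the observations within each category using groupby(...).cumcount(). The function names stacked_dot_heights and plot_stacked_dots are hypothetical, and the sketch assumes the data contains no missing values.

```python
import pandas as pd


def stacked_dot_heights(values):
    """Return each observation's 1-based position within its category's stack.

    Hypothetical helper for illustration; assumes no missing values.
    """
    s = pd.Series(values)
    # cumcount() numbers the rows of each group 0, 1, 2, ... in order of appearance.
    return s.groupby(s).cumcount() + 1


def plot_stacked_dots(values):
    """Scatter each observation at (category, position in stack)."""
    from matplotlib import pyplot  # imported lazily; only needed for drawing

    heights = stacked_dot_heights(values)
    fig, ax = pyplot.subplots()
    ax.scatter(list(values), heights)
    ax.set(xlabel='Category', ylabel='Count')
    return ax
```

For example, stacked_dot_heights(['cat', 'dog', 'cat']) gives the heights 1, 1, 2, so the second 'cat' is drawn on top of the first.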
