As the title says: I want to draw a histogram in which each bar is replaced by a column of stacked dots. There is an answer to this specific question for R, but I'd like to stay within Python.

Any help is much appreciated :)

Edit: Added link to image example of what the final result should look like

2 Answers

Not sure exactly what you mean by "a histogram with dots," but what you described sounds reminiscent to me of seaborn's swarmplot:

import seaborn as sns
sns.swarmplot(x="day", y="total_bill", data=sns.load_dataset("tips"))

[Image: seaborn swarmplot example]

Swarmplot documentation here: https://seaborn.pydata.org/generated/seaborn.swarmplot.html

Upon seeing your edit, it seems like perhaps this is more of what you're looking for:

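import matplotlib.pyplot as plt
import numpy as np
from collections import Counter

data = np.random.randint(10, size=100)
remaining = dict(Counter(data))  # occurrences of each value not yet plotted
heights = []

# Give every observation its own height, counting down from the value's
# total count to 1, so that equal values stack into a column of dots.
for value in data:
    heights.append(remaining[value])
    remaining[value] -= 1

plt.scatter(data, heights)
plt.show()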

I personally think the swarmplot looks a lot better, but whatever floats your boat.

C. Helling
  • I am afraid this is not what I meant. What I want is a histogram in which the bars are replaced by strips of dots, so that a bar of height 3 is literally shown as a stack of 3 dots. I have edited the question and added a picture. – jasikevicius23 May 04 '18 at 11:31
  • @jasikevicius23 As far as I know, this isn't a feature of [matplotlib.pyplot](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.hist.html) or [pandas](https://pandas.pydata.org/pandas-docs/stable/visualization.html). This could perhaps be done by taking data from a histogram and reshaping it as a scatterplot. Would that work? – C. Helling May 04 '18 at 14:56
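
The idea in the comment above, taking the counts from a histogram and reshaping them into scatter points, could be sketched roughly like this for numeric data. This is only a sketch: the helper names `histogram_to_dots` and `plot_dots` and the default of 10 bins are made up for illustration, and it assumes the bin counts from `numpy.histogram` are what you want to stack.

```python
import numpy as np


def histogram_to_dots(values, bins=10):
    """Return (x, y) coordinates with one dot per observation.

    Hypothetical helper: x is the center of the bin an observation falls
    in, y is its position (1, 2, ..., count) within that bin's stack.
    """
    counts, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    # Repeat each bin center once per observation counted in that bin.
    x = np.repeat(centers, counts)
    # Within each bin, stack the dots at heights 1, 2, ..., count.
    y = np.concatenate([np.arange(1, c + 1) for c in counts])
    return x, y


def plot_dots(x, y):
    """Draw the stacked dots as a scatter plot (matplotlib imported lazily)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(x, y)
    ax.set(xlabel="Value", ylabel="Count")
    plt.show()
```

Calling `plot_dots(*histogram_to_dots(data, bins=15))` on a 1-D array `data` should then draw one column of dots per non-empty bin. Dots will only touch each other if the marker size and axis scaling are tuned to match.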

There's nothing out of the box that will do this in matplotlib or its derivatives (that I'm familiar with). Luckily, pandas.Series.value_counts() does a lot of the heavy lifting for us:

import numpy
from matplotlib import pyplot
import pandas

numpy.random.seed(0)
pets = ['cat', 'dog', 'bird', 'lizard', 'hamster']
# value_counts() returns the number of occurrences of each pet
hist = pandas.Series(numpy.random.choice(pets, size=25)).value_counts()
x = []
y = []
for p in pets:
    n = hist.get(p, 0)  # 0 if this pet was never drawn
    x.extend([p] * n)  # one x entry per dot in this pet's column
    y.extend(numpy.arange(n) + 1)  # dot heights 1, 2, ..., n

fig, ax = pyplot.subplots(figsize=(6, 6))
ax.scatter(x, y)
ax.set(aspect='equal', xlabel='Pet', ylabel='Count')
pyplot.show()

That gives one column of stacked dots per pet, with the height of each column equal to that pet's count.
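
If you would rather avoid the explicit loop, one possible variant is to let pandas number the observations within each category. This is a sketch rather than a drop-in replacement: the function name `stacked_dot_coords` is made up here, and it relies on `groupby(...).cumcount()`, which numbers the rows of each group starting from 0 in their original order.

```python
import pandas as pd


def stacked_dot_coords(observations):
    """Return (x, y) lists with one dot per observation.

    Hypothetical helper: x is the category itself, y is the running
    count of that category so far (1 for its first occurrence, 2 for
    its second, and so on).
    """
    s = pd.Series(list(observations))
    # cumcount() is 0-based within each group, so add 1 for dot heights.
    heights = s.groupby(s).cumcount() + 1
    return s.tolist(), heights.tolist()
```

The two returned lists can be passed straight to `ax.scatter(x, y)`. Since every observation becomes exactly one dot, the tallest dot in each column sits at that category's total count.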

Paul H