I think this is a simple question, but I can't find any blog post showing how to do it. I have a pandas Series called "series" and I use series.hist() to plot its histogram. I need to show the number of occurrences for each bin directly on the graph, but I can't find a way to do this.

How can I put a label on top of each bin showing the number of occurrences in that bin?

To be precise, this is my code:

import matplotlib.pyplot as plt
your_bins=10
data = [df_5m_9_4pm.loc['2017-6']['sum_daily_cum_ret'].values]
plt.hist(data, binds = your_bins)
arr = plt.hist(data,bins = your_bins)
for i in range(your_bins):
    plt.text(arr[1][i],arr[0][i],str(arr[0][i]))

If I simply print the variable "data", this is what it looks like:

[array([ 0.        ,  0.03099187, -0.00417244, ..., -0.00459067,
         0.0529476 , -0.0076605 ])]

If I run the code above, I get this error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-97-917078981b1d> in <module>()
      2 your_bins=10
      3 data = [df_5m_9_4pm.loc['2017-6']['sum_daily_cum_ret'].values]
----> 4 plt.hist(data, binds = your_bins)
      5 arr = plt.hist(data,bins = your_bins)
      6 for i in range(your_bins):

~/anaconda3/lib/python3.6/site-packages/matplotlib/pyplot.py in hist(x, bins, range, density, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, normed, hold, data, **kwargs)
   3002                       histtype=histtype, align=align, orientation=orientation,
   3003                       rwidth=rwidth, log=log, color=color, label=label,
-> 3004                       stacked=stacked, normed=normed, data=data, **kwargs)
   3005     finally:
   3006         ax._hold = washold

~/anaconda3/lib/python3.6/site-packages/matplotlib/__init__.py in inner(ax, *args, **kwargs)
   1708                     warnings.warn(msg % (label_namer, func.__name__),
   1709                                   RuntimeWarning, stacklevel=2)
-> 1710             return func(ax, *args, **kwargs)
   1711         pre_doc = inner.__doc__
   1712         if pre_doc is None:

~/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_axes.py in hist(***failed resolving arguments***)
   6205             # this will automatically overwrite bins,
   6206             # so that each histogram uses the same bins
-> 6207             m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
   6208             m = m.astype(float)  # causes problems later if it's an int
   6209             if mlast is None:

~/anaconda3/lib/python3.6/site-packages/numpy/lib/function_base.py in histogram(a, bins, range, normed, weights, density)
    665     if mn > mx:
    666         raise ValueError(
--> 667             'max must be larger than min in range parameter.')
    668     if not np.all(np.isfinite([mn, mx])):
    669         raise ValueError(

ValueError: max must be larger than min in range parameter.
cs95
Andrea
  • @coldspeed - that solution in that link doesn't work on my end. I get an error message. – Andrea Dec 29 '17 at 17:55
  • the error message I get using that code is: "ValueError: max must be larger than min in range parameter." – Andrea Dec 29 '17 at 18:01
  • I've reopened your question. – cs95 Dec 29 '17 at 18:17
  • `binds`? Maybe `bins`? – Georgy Dec 29 '17 at 18:58
  • @Georgy that was a typo when I pasted it here. Good catch, but the original code doesn't have "binds" and it still doesn't work. Otherwise, would you at least know a way to get the array with the bin counts? – Andrea Dec 29 '17 at 19:05
  • From the code and your print of `data` it looks like you've enclosed your data in a list--so you're passing a list of an array instead of an array of values. Try `plt.hist(data[0], bins = your_bins)` instead. Or better yet just drop the brackets when assigning `data` – Patrick O'Connor Dec 29 '17 at 19:23
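
The suggestion in the last comment, and the earlier request for the array of bin counts, can be sketched roughly as follows. This is only an illustration: the DataFrame from the question is not available, so `values` is synthetic stand-in data, and the names `wrapped` and `flat` are made up for the example. It uses np.histogram, which returns the per-bin counts and the bin edges without drawing anything.

```python
import numpy as np

# Synthetic stand-in for
# df_5m_9_4pm.loc['2017-6']['sum_daily_cum_ret'].values (an assumption)
rng = np.random.default_rng(0)
values = rng.normal(scale=0.03, size=500)

wrapped = [values]   # what the question builds: a list holding one array
flat = wrapped[0]    # what the comment suggests passing: the 1-D array itself

your_bins = 10
# np.histogram returns the count in each bin and the bin edges
counts, edges = np.histogram(flat, bins=your_bins)

print(np.asarray(wrapped).shape)  # shape of the wrapped data
print(flat.shape)                 # shape of the unwrapped data
print(counts)                     # one integer count per bin
print(edges)                      # your_bins + 1 bin edges
```

The same `counts` array is also the first value returned by plt.hist(flat, bins=your_bins), so either call can feed the labels.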

1 Answer

Try this:

import matplotlib.pyplot as plt
import numpy as np

x = np.random.normal(size=1000)
# density=True normalizes the bars; it replaces normed=True,
# which newer Matplotlib versions no longer accept
counts, bins, patches = plt.hist(x, density=True)
plt.ylabel('Probability')

# Label each bar with its (normalized) height, just above the x-axis
bin_centers = 0.5 * np.diff(bins) + bins[:-1]
for count, center in zip(counts, bin_centers):
    plt.annotate('{:.2f}'.format(count), xy=(center, 0), xycoords=('data', 'axes fraction'),
        xytext=(0, 18), textcoords='offset points', va='top', ha='center')

plt.show()

Labeled bins

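If you want raw occurrences instead of frequencies, just remove the normalization argument (normed=True, or density=True in newer Matplotlib versions, where normed has been removed) and change the format string, for example to '{:.0f}'.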
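
For raw occurrences placed on top of each bar, as the question asks, something like the following sketch might work. It is not the code from the linked question: the data is synthetic, and the variable names (`data`, `labels`) and the 3-point offset are assumptions chosen for illustration. Each label is anchored at the top centre of its bar using the rectangle patches that hist returns.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=1000)   # synthetic example data (an assumption)

fig, ax = plt.subplots()
# No normalization argument, so the bar heights are raw counts
counts, bins, patches = ax.hist(data, bins=10)

labels = []
for count, patch in zip(counts, patches):
    # Top centre of the bar, in data coordinates
    x = patch.get_x() + patch.get_width() / 2
    y = patch.get_height()
    labels.append(
        ax.annotate('{:.0f}'.format(count), xy=(x, y),
                    xytext=(0, 3), textcoords='offset points',
                    ha='center', va='bottom'))

ax.set_ylabel('Occurrences')
```

Call plt.show() afterwards to display the figure.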

You could also have solved this by copying the code from the question linked in the sidebar and changing (0, -18) to (0, 18).

nnnmmm
  • What if I'm plotting the frequencies but want to show the raw count as data labels for the bars - is there a way to do that? – Chipmunk_da Jun 07 '20 at 13:19
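
One possible way to do what the last comment asks, sketched under assumptions: plot the normalized histogram, then compute the raw counts separately with np.histogram on the same bin edges and use those as the label text. The data is synthetic and the names (`raw_counts`, `labels`) are made up for the example.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=1000)   # synthetic example data (an assumption)

fig, ax = plt.subplots()
# Bars show the normalized frequencies (probability density)
density, bins, patches = ax.hist(data, bins=10, density=True)

# Raw counts computed on exactly the same bin edges
raw_counts, _ = np.histogram(data, bins=bins)

bin_centers = 0.5 * (bins[:-1] + bins[1:])
labels = []
for n, center, height in zip(raw_counts, bin_centers, density):
    # Place the raw count just above the top of the normalized bar
    labels.append(
        ax.annotate(str(n), xy=(center, height),
                    xytext=(0, 3), textcoords='offset points',
                    ha='center', va='bottom'))

ax.set_ylabel('Probability density')
```

Because both calls share the same edges, each label matches the bar it sits on; call plt.show() afterwards to display the figure.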