I just want to plot a very simple histogram in Python, but it doesn't work.

I am importing the data from Excel and converting it into a pandas DataFrame:

import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_excel("test.xlsx", sheet_name='new_tests', na_filter=True)
data_1 = pd.DataFrame(data, columns=['x', 'y'])

x is a list of the form 1, 2, 3, ..., 100, and y holds the corresponding events of my model, for example 107, 208, ..., also with 100 entries of course.

Now I want to plot a simple histogram with, for example, 10 bins. That means the first bin should contain the summed events for x = [1, 2, ..., 10], i.e. the sum of y = [107, 208, ...], but

plt.hist(x,y,bins=20) doesn't work: the plot is empty.

  • 4
    Please include a _small_ subset of your data as a __copyable__ piece of code that can be used for testing as well as your expected output for the __provided__ data. See [MRE - Minimal, Reproducible, Example](https://stackoverflow.com/help/minimal-reproducible-example), and [How to make good reproducible pandas examples](https://stackoverflow.com/q/20109391/15497888). – Henry Ecker May 24 '21 at 15:21
  • 1
    `plt.hist(x,y,bins=20)` what are `x` and `y` here? You haven't defined them. Did you mean `data_1["x"]`? – Dan May 24 '21 at 15:25
  • @Dan Sorry! Yes, I meant data_1['x'] –  May 24 '21 at 15:30
  • Have you tried df.plot.hist? See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.hist.html – Tom McLean May 24 '21 at 15:34
  • x and y values are always twisted –  May 24 '21 at 15:39

1 Answer

From the documentation of plt.hist:

matplotlib.pyplot.hist(x, bins=None, range=None, density=False, weights=None, cumulative=False, bottom=None, histtype='bar', align='mid', orientation='vertical', rwidth=None, log=False, color=None, label=None, stacked=False, *, data=None, **kwargs)

x : (n,) array or sequence of (n,) arrays. Input values; this takes either a single array or a sequence of arrays which are not required to be of the same length.

bins : int or sequence or str, default: rcParams["hist.bins"] (default: 10). If bins is an integer, it defines the number of equal-width bins in the range.

If bins is a sequence, it defines the bin edges, including the left edge of the first bin and the right edge of the last bin; in this case, bins may be unequally spaced. All but the last (righthand-most) bin is half-open.
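
To make the two meanings of bins concrete, here is a minimal sketch using numpy.histogram, which follows the same binning rules as described above. The array `values` is made-up illustration data, not the asker's data.

```python
import numpy as np

# Made-up sample values, purely for illustration.
values = np.array([1, 2, 2, 3, 5, 8, 10])

# bins as an integer: 3 equal-width bins between min (1) and max (10).
counts_int, edges_int = np.histogram(values, bins=3)
print(edges_int)   # [ 1.  4.  7. 10.]
print(counts_int)  # [4 1 2]

# bins as a sequence: the numbers are the bin edges themselves,
# so the bins may be unequally spaced. Only the last bin is closed.
counts_seq, edges_seq = np.histogram(values, bins=[0, 2, 5, 10])
print(counts_seq)  # [1 3 3]
```

In both cases the result is a count of how many values fall into each bin, not a sum of some second column.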

So, following the documentation, the second positional argument of plt.hist is bins, which means you are passing y as the bin edges. You should pass only the recorded data values as x to the plt.hist function and set the number of bins. Try:

plt.hist(y, bins=20)
Tom McLean
  • No, it doesn't work, because it counts how many times, for example, the value "5" occurs in the list of y-values –  May 24 '21 at 17:43
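
One possible way to get summed events per bin, as the question describes, is the weights parameter from the plt.hist signature quoted above: bin on x and let each x contribute its y value instead of a count of 1. This is a hedged sketch rather than a tested fix for the original spreadsheet. The arrays `x` and `y` below are made-up stand-ins for data_1['x'] and data_1['y'], and the Agg backend is selected only so that the sketch runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Hypothetical stand-in data: x = 1..100, y = made-up event counts (10 * x).
x = np.arange(1, 101)
y = 10 * x

# Bin on x; each entry adds its y value to the bin total (weights),
# so every bar is the sum of y over the x values that fall into that bin.
sums, edges, _ = plt.hist(x, bins=10, weights=y)

print(len(sums))  # number of bars
print(sums[0])    # sum of y for x = 1..10
```

With the real data, data_1['x'] and data_1['y'] would take the place of `x` and `y`, followed by plt.show() to display the figure.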