I am currently doing some analysis on Amazon's item ratings using pandas and matplotlib.

I am using the following code to generate a plot:

import os
import matplotlib.pyplot as plt

result = ratings.copy()
result["rating"].plot(kind="hist")  # histogram with the default bins

fname = os.path.join("..", "figs", "my_figure.png")
plt.savefig(fname)

Here `ratings` is a DataFrame that looks like this:

0       A2VNYWOPJ13AFP  0981850006     5.0  1259798400
1       A20DWVV8HML3AW  0981850006     5.0  1371081600
2       A3RVP3YBYYOPRH  0981850006     5.0  1257984000
3       A28XY55TP3Q90O  0981850006     3.0  1314144000
4       A3VZW1BGUQO0V3  0981850006     3.0  1308268800
5       A2R9T5D7UVQZB0  0981850006     5.0  1253577600
6       A2MH49GAEWEI95  0981850006     5.0  1395532800
7        AR5DPX4ZU3D4Z  144072007X     1.0  1360886400

Unfortunately, my generated plot looks like this (notice how the bars are not all lined up with their tick labels):

[Figure: histogram of the rating column, with several bars offset from their tick labels]

How do I center every bar on its rating value, the way the bars at 4.0 and 2.0 already are?

AlanSTACK
  • It seems you only consider integers as possible outcomes. Hence you need to specify the histogram bins accordingly. – ImportanceOfBeingErnest Feb 06 '19 at 21:32
  • @ImportanceOfBeingErnest That's not my question. Even after specifying bins `[1, 2, 3, 4, 5]`, the graph is still unaligned. Some bars are to the right of the axis tick, some to the left. – AlanSTACK Feb 06 '19 at 21:34
  • How about `result["rating"].hist(bins=5)`? If you provide a list, it is taken as the bin edges, so you need to make sure each integer rating falls between two edges, like so: `bins=[.5, 1.5, 2.5, 3.5, 4.5, 5.5]` – Aaron Feb 06 '19 at 21:36
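
The bin-edge suggestion in the last comment can be sketched as follows. This is a minimal example, not a tested answer from the thread. The small `ratings` frame is a hypothetical stand-in built from the rating column of the sample rows in the question, and the `Agg` backend is chosen only so the script runs without a display. A list such as `[1, 2, 3, 4, 5]` is read as edges, which gives four bins that each run from one integer to the next (with 4 and 5 sharing the last bin), so no bar can sit on top of its rating. Half-integer edges put every rating in the middle of its own bin.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import numpy as np
import pandas as pd

# Hypothetical stand-in for the "rating" column of the question's DataFrame.
ratings = pd.DataFrame({"rating": [5.0, 5.0, 5.0, 3.0, 3.0, 5.0, 5.0, 1.0]})

# Edges at half-integers: 0.5, 1.5, ..., 5.5, i.e. one unit-wide bin per rating.
bins = np.arange(0.5, 6.0, 1.0)

ax = ratings["rating"].plot(kind="hist", bins=bins)
ax.set_xticks(range(1, 6))  # put the ticks on the ratings themselves
ax.set_xlabel("rating")

# Each bar's horizontal center should now coincide with an integer rating.
centers = [patch.get_x() + patch.get_width() / 2 for patch in ax.patches]
heights = [patch.get_height() for patch in ax.patches]
print(centers, heights)
```

The figure can then be saved with `plt.savefig(fname)` exactly as in the question. If ratings can take non-integer values, the edges would need to be chosen differently.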
