Questions tagged [histogram]

In statistics, a histogram is a graphical representation, showing a visual impression of the distribution of data.

A histogram is an estimate of the probability distribution of a continuous variable and was first introduced by Karl Pearson. It consists of tabular frequencies, shown as adjacent rectangles erected over discrete intervals (bins), each with an area equal to the frequency of the observations in the interval. The height of a rectangle is therefore equal to the frequency density of the interval, i.e., the frequency divided by the width of the interval. The total area of the histogram is equal to the number of data points. A histogram may also be normalized to display relative frequencies. It then shows the proportion of cases that fall into each of several categories, with the total area equaling 1. The categories are usually specified as consecutive, non-overlapping intervals of a variable. The categories (intervals) must be adjacent, and are often chosen to be of the same size.
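As a rough illustration of the relationship between frequency, bin width, and frequency density, here is a minimal pure-Python sketch. The function name `frequency_density`, the sample data, and the bin edges are all made up for this example, and the convention that the last bin includes its right edge is an assumption, not something the description above specifies.

```python
from bisect import bisect_right

def frequency_density(data, edges):
    """Return (counts, densities) for consecutive bins given sorted edges.

    Assumed convention: bins are half-open [lo, hi), except the last bin,
    which also includes its right edge. Values outside the range are skipped.
    """
    counts = [0] * (len(edges) - 1)
    for x in data:
        if x < edges[0] or x > edges[-1]:
            continue  # outside the binned range
        # Index of the bin containing x; clamp so x == edges[-1] lands in the last bin.
        i = min(bisect_right(edges, x) - 1, len(counts) - 1)
        counts[i] += 1
    widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
    # Height of each rectangle = frequency / bin width.
    densities = [c / w for c, w in zip(counts, widths)]
    return counts, densities

# Hypothetical data and deliberately unequal bin widths (4, 2, 4).
data = [1, 2, 2, 3, 5, 6, 6, 7, 9, 10]
edges = [0, 4, 6, 10]
counts, densities = frequency_density(data, edges)

# Area of each rectangle = height * width, so the total area is the number of points.
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
total_area = sum(d * w for d, w in zip(densities, widths))
print(counts, densities, total_area)
```

With unequal bin widths the heights differ from the raw counts, but each rectangle's area still equals the count for its bin.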

Histograms are used to plot the density of data, and often for density estimation: estimating the probability density function of the underlying variable. The total area of a histogram used for probability density is always normalized to 1. If every interval on the x-axis has width 1, then the histogram is identical to a relative frequency plot.
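The normalization can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `density_histogram` is a hypothetical helper, the data are invented, every value is assumed to lie inside the binned range, and the last bin is assumed to include its right edge.

```python
def density_histogram(data, edges):
    """Return bar heights scaled so that the total area is 1.

    Assumes every value in `data` lies within [edges[0], edges[-1]];
    the last bin is treated as closed on the right.
    """
    n = len(data)
    last = len(edges) - 2
    heights = []
    for i, (lo, hi) in enumerate(zip(edges, edges[1:])):
        if i == last:
            count = sum(1 for x in data if lo <= x <= hi)
        else:
            count = sum(1 for x in data if lo <= x < hi)
        # Density height = count / (total number of points * bin width).
        heights.append(count / (n * (hi - lo)))
    return heights

# Hypothetical sample with eight values between 0 and 4.
data = [0.2, 0.7, 1.1, 1.5, 1.9, 2.4, 3.3, 3.8]

# Bins of width 1: the heights coincide with the relative frequencies.
unit_edges = [0, 1, 2, 3, 4]
unit_heights = density_histogram(data, unit_edges)
unit_area = sum(h * 1 for h in unit_heights)

# Bins of width 2: the heights change, but the total area is still 1.
wide_edges = [0, 2, 4]
wide_heights = density_histogram(data, wide_edges)
wide_area = sum(h * 2 for h in wide_heights)
print(unit_heights, wide_heights)
```

Only in the width-1 case can the bar heights be read directly as proportions; with other widths the proportion for a bin is its height times its width.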

In scientific software for statistical computing and graphics, the function hist generates a histogram. It can optionally scale the histogram so that its total area is 1, which puts it on the right scale if one wants to overlay a probability density curve.

More about it here: histogram wiki

6663 questions
356
votes
13 answers

Plot two histograms on single chart with matplotlib

I created a histogram plot using data from a file and no problem. Now I wanted to superpose data from another file in the same histogram, so I do something like this n,bins,patchs = ax.hist(mydata1,100) n,bins,patchs = ax.hist(mydata2,100) but the…
Open the way
  • 26,225
  • 51
  • 142
  • 196
268
votes
9 answers

Using a dictionary to count the items in a list

Suppose I have a list of items, like: ['apple', 'red', 'apple', 'red', 'red', 'pear'] I want a dictionary that counts how many times each item appears in the list. So for the list above the result should be: {'apple': 2, 'red': 3, 'pear': 1} How…
Sophie
  • 2,681
  • 2
  • 16
  • 3
261
votes
9 answers

How to plot two histograms together in R?

I am using R and I have two data frames: carrots and cucumbers. Each data frame has a single numeric column that lists the length of all measured carrots (total: 100k carrots) and cucumbers (total: 50k cucumbers). I wish to plot two histograms -…
David B
  • 29,258
  • 50
  • 133
  • 186
226
votes
10 answers

Histogram using gnuplot?

I know how to create a histogram (just use "with boxes") in gnuplot if my .dat file already has properly binned data. Is there a way to take a list of numbers and have gnuplot provide a histogram based on ranges and bin sizes the user provides?
mary
  • 2,577
  • 5
  • 19
  • 11
197
votes
9 answers

Bin size in Matplotlib (Histogram)

I'm using matplotlib to make a histogram. Is there any way to manually set the size of the bins as opposed to the number of bins?
Sam Creamer
  • 5,187
  • 13
  • 34
  • 49
169
votes
2 answers

Understanding TensorBoard (weight) histograms

It is really straightforward to see and understand the scalar values in TensorBoard. However, it's not clear how to understand histogram graphs. For example, they are the histograms of my network weights. (After fixing a bug thanks to…
Sung Kim
  • 8,417
  • 9
  • 34
  • 42
163
votes
14 answers

Scatterplot with marginal histograms in ggplot2

Is there a way of creating scatterplots with marginal histograms just like in the sample below in ggplot2? In Matlab it is the scatterhist() function and there exist equivalents for R as well. However, I haven't seen it for ggplot2. I started an…
Seb
  • 5,417
  • 7
  • 31
  • 50
139
votes
3 answers

How does numpy.histogram() work?

While reading up on numpy, I encountered the function numpy.histogram(). What is it for and how does it work? In the docs they mention bins: What are they? Some googling led me to the definition of Histograms in general. I get that. But…
Aufwind
  • 25,310
  • 38
  • 109
  • 154
121
votes
7 answers

Histogram Matplotlib

So I have a little problem. I have a data set in scipy that is already in the histogram format, so I have the center of the bins and the number of events per bin. How can I now plot is as a histogram. I tried just doing bins, n=hist() but it…
madtowneast
  • 2,350
  • 3
  • 22
  • 31
113
votes
4 answers

save a pandas.Series histogram plot to file

In ipython Notebook, first create a pandas Series object, then by calling the instance method .hist(), the browser displays the figure. I am wondering how to save this figure to a file (I mean not by right click and save as, but the commands needed…
GeauxEric
  • 2,814
  • 6
  • 26
  • 33
103
votes
7 answers

Fitting a density curve to a histogram in R

Is there a function in R that fits a curve to a histogram? Let's say you had the following histogram hist(c(rep(65, times=5), rep(25, times=5), rep(35, times=10), rep(45, times=4))) It looks normal, but it's skewed. I want to fit a normal curve…
boo-urns
  • 10,136
  • 26
  • 71
  • 107
101
votes
4 answers

How to have logarithmic bins in a Python histogram

As far as I know the option Log=True in the histogram function only refers to the y-axis. P.hist(d,bins=50,log=True,alpha=0.5,color='b',histtype='step') I need the bins to be equally spaced in log10. Is there something that can do this?
Brian
  • 13,996
  • 19
  • 70
  • 94
99
votes
6 answers

Plot a histogram such that bar heights sum to 1 (probability)

I'd like to plot a normalized histogram from a vector using matplotlib. I tried the following: plt.hist(myarray, normed=True) as well as: plt.hist(myarray, normed=1) but neither option produces a y-axis from [0, 1] such that the bar heights of…
user248237
96
votes
6 answers

Plotting histograms from grouped data in a pandas DataFrame

How do I plot a block of histograms from a group of data in a dataframe? For example, given: from pandas import DataFrame import numpy as np x = ['A']*300 + ['B']*400 + ['C']*300 y = np.random.randn(1000) df = DataFrame({'Letter': x, 'N': y}) I…
dreme
  • 4,761
  • 3
  • 18
  • 20
92
votes
10 answers

Getting data for histogram plot

Is there a way to specify bin sizes in MySQL? Right now, I am trying the following SQL query: select total, count(total) from faults GROUP BY total; The data that is being generated is good enough but there are just too many rows. What I need is a…
Legend
  • 113,822
  • 119
  • 272
  • 400