Questions tagged [boxplot]

A boxplot is a way of displaying cardinally scaled data that shows robust summary statistics as graphical elements.

A boxplot (or box-and-whisker plot) is a means of displaying cardinally scaled data. The graphic displays robust summary statistics of a given dataset. The box shows the median, the lower quartile and the upper quartile. The whiskers attached to the box have no single standard definition: they may extend to a limit tied to the interquartile range (e.g. 1.5 × IQR beyond the quartiles) or to the minimum and maximum. Sometimes outliers are displayed as well.
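
As a rough illustration of the statistics described above, the following sketch computes the box and whisker values using only the Python standard library. It assumes the 1.5 × IQR whisker convention and linearly interpolated quartiles; plotting libraries may use other quartile methods or whisker rules, so their numbers can differ. The function name `boxplot_stats` and the parameter `whisker_factor` are made up for this example.

```python
from statistics import median, quantiles


def boxplot_stats(values, whisker_factor=1.5):
    """Return the summary statistics a boxplot typically displays.

    Assumption: whiskers end at the most extreme data points that lie
    within whisker_factor * IQR of the box. Needs at least two values.
    """
    data = sorted(values)
    # Quartiles by linear interpolation between the sorted data points.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker_factor * iqr
    upper_fence = q3 + whisker_factor * iqr
    # Points inside the fences determine where the whiskers end.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "median": median(data),
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        # Points beyond the fences would be drawn as individual outliers.
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


stats = boxplot_stats([1, 2, 3, 4, 5, 10])
print(stats)
```

With this convention the value 10 lies above the upper fence, so it is reported as an outlier and the upper whisker ends at 5.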

Most statistical packages can create boxplots easily, for example:

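boxplot(rnorm(100))            # R
boxplot(randn(100, 1))         % Matlab (a column vector gives a single box)
graph box variable             // Stata
plt.boxplot(data)              # matplotlib (Python), after: import matplotlib.pyplot as plt
sns.boxplot(data=dataframe)    # seaborn (Python), after: import seaborn as sns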

3351 questions
104
votes
1 answer

How to set the range of y-axis for a seaborn boxplot

From the official seaborn documentation, I learned that you can create a boxplot as below: import seaborn as sns sns.set_style("whitegrid") tips = sns.load_dataset("tips") ax = sns.boxplot(x="day", y="total_bill", data=tips) My question is: how do…
Xin
  • 4,392
  • 5
  • 19
  • 15
90
votes
10 answers

matplotlib: Group boxplots

Is there a way to group boxplots in matplotlib? Assume we have three groups "A", "B", and "C" and for each we want to create a boxplot for both "apples" and "oranges". If a grouping is not possible directly, we can create all six combinations and…
bluenote10
  • 23,414
  • 14
  • 122
  • 178
90
votes
6 answers

Plot multiple boxplot in one graph

I saved my data as a .csv file with 12 columns. Columns two through 11 (labeled F1, F2, ..., F11) are features. Column one contains the label of these features either good or bad. I would like to plot a boxplot of all these 11 features against…
Samo Jerom
  • 2,361
  • 7
  • 32
  • 38
83
votes
6 answers

Boxplots in matplotlib: Markers and outliers

I have some questions about boxplots in matplotlib: Question A. What do the markers that I highlighted below with Q1, Q2, and Q3 represent? I believe Q1 is maximum and Q3 are outliers, but what is Q2? Question B. How does…
Amelio Vazquez-Reina
  • 91,494
  • 132
  • 359
  • 564
65
votes
5 answers

Seaborn load_dataset

I am trying to get a grouped boxplot working using Seaborn as per the example I can get the above example working, however the line: tips = sns.load_dataset("tips") is not explained at all. I have located the tips.csv file, but I can't seem to…
Arsibalt
  • 687
  • 1
  • 5
  • 8
63
votes
4 answers

Transform only one axis to log10 scale with ggplot2

I have the following problem: I would like to visualize a discrete and a continuous variable on a boxplot in which the latter has a few extreme high values. This makes the boxplot meaningless (the points and even the "body" of the chart is too…
daroczig
  • 28,004
  • 7
  • 90
  • 124
63
votes
3 answers

Subplot for seaborn boxplot

I have a dataframe like this import seaborn as sns import pandas as pd %pylab inline df = pd.DataFrame({'a' :['one','one','two','two','one','two','one','one','one','two'], 'b': [1,2,1,2,1,2,1,2,1,1], 'c':…
Edward
  • 4,443
  • 16
  • 46
  • 81
57
votes
6 answers

Put stars on ggplot barplots and boxplots - to indicate the level of significance (p-value)

It's common to put stars on barplots or boxplots to show the level of significance (p-value) of one or between two groups, below are several examples: The number of stars are defined by p-value, for example one can put 3 stars for p-value < 0.001,…
Ali
  • 9,440
  • 12
  • 62
  • 92
57
votes
3 answers

Put whisker ends on boxplot

I would like to put perpendicular lines at the ends of the whiskers like the boxplot function automatically gives.
user1762299
  • 579
  • 1
  • 4
  • 3
50
votes
4 answers

Python Matplotlib Boxplot Color

I am trying to make two sets of box plots using Matplotlib. I want each set of box plots filled (and points and whiskers) in a different color. So basically there will be two colors on the plot. My code is below, would be great if you can help make…
user58925
  • 1,537
  • 5
  • 19
  • 28
48
votes
1 answer

Change or modify x axis tick labels in R using ggplot2

How can I change the names of my x axis labels in ggplot2? See below: ggbox <- ggplot(buffer, aes(SampledLUL, SOC)) + geom_boxplot() ggbox <- ggbox + theme(axis.text.x=element_text(color = "black", size=11, angle=30, vjust=.8, hjust=0.8)) ggbox<-…
derelict
  • 3,657
  • 3
  • 24
  • 29
46
votes
3 answers

How to remove the duplicate legend when overlaying boxplot and stripplot

One of the coolest things you can easily make in seaborn is boxplot + stripplot combination: import matplotlib.pyplot as plt import seaborn as sns import pandas as pd tips = sns.load_dataset("tips") sns.stripplot(x="day", y="total_bill",…
Sergey Antopolskiy
  • 3,970
  • 2
  • 24
  • 40
46
votes
1 answer

Matplotlib boxplot without outliers

Is there any way of hiding the outliers when plotting a boxplot in matplotlib (python)? I'm using the simplest way of plotting it: from pylab import * boxplot([1,2,3,4,5,10]) show() This gives me the following plot: (I cannot post the image…
Didac Busquets
  • 605
  • 1
  • 5
  • 8
45
votes
4 answers

In ggplot2, what do the end of the boxplot lines represent?

I can't find a description of what the end points of the lines of a boxplot represent. For example, here are point values above and below where the lines end. (I realize that the top and bottom of the box are 25th and 75th percentile, and the…
djq
  • 14,810
  • 45
  • 122
  • 157
44
votes
2 answers

Tweaking seaborn.boxplot

I would like to compare a set of distributions of scores (score), grouped by some categories (centrality) and colored by some other (model). I've tried the following with seaborn: plt.figure(figsize=(14,6)) seaborn.boxplot(x="centrality", y="score",…
clstaudt
  • 21,436
  • 45
  • 156
  • 239