
I would like to get the specific values shown in a boxplot generated with Seaborn (i.e., the median and quartiles). For example, for the boxplot below (source: link), is there any way to get the median and quartiles programmatically instead of estimating them manually from the plot?

import numpy as np
import seaborn as sns
sns.set(style="ticks", palette="muted", color_codes=True)

# Load the example planets dataset
planets = sns.load_dataset("planets")

# Plot the distance for each detection method with horizontal boxes
ax = sns.boxplot(x="distance", y="method", data=planets,
             whis=np.inf, color="c")
Omar
  • I tried `np.median(planets)` and got a single value, not the median of each box. I would appreciate any insight. – Omar Jan 15 '16 at 19:51
  • I'd familiarize yourself with pandas groupby methods: http://pandas.pydata.org/pandas-docs/stable/groupby.html – mwaskom Jan 16 '16 at 20:18

2 Answers


I would encourage you to become familiar with using pandas to extract quantitative information from a dataframe. For instance, a simple way to get the values you are looking for (and other useful ones) would be:

planets.groupby("method").distance.describe().unstack()

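which prints a table of summary statistics (count, mean, standard deviation, minimum, quartiles, median, and maximum) for each method. In recent pandas versions, describe() on a grouped column should already return this table, so the trailing .unstack() can likely be dropped.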

Or if you just want the median:

planets.groupby("method").distance.median()
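If you only need the quartiles and the median, a minimal sketch along the same lines is to pass several quantiles to `quantile` at once. The small dataframe below is made up for illustration so the snippet runs without downloading the planets dataset; only the column names `method` and `distance` mirror the question, and the names `toy` and `summary` are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the planets dataset (values are made up).
toy = pd.DataFrame({
    "method": ["A", "A", "A", "A", "A", "B", "B", "B"],
    "distance": [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 30.0],
})

# Ask for the three quantiles per group, then pivot the quantile level
# into columns so there is one row per method.
summary = (
    toy.groupby("method")["distance"]
       .quantile([0.25, 0.5, 0.75])
       .unstack()
)
summary.columns = ["q1", "median", "q3"]

print(summary)
```

With the real data, replacing `toy` with `planets` should give one row per detection method.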
mwaskom
  • Hey @mwaskom. Is there a way to get the values of specific columns for a given quantile? For example, my df has a column 'ID'. I can do `cp.groupby([cp['issue_date'].dt.month]).describe().unstack()` and obtain something like what you showed above. But for each group, I would like to get the IDs that fall in a given quantile. – pceccon Jun 19 '17 at 13:46

Sometimes I keep my data as a list of arrays rather than in a pandas dataframe. In that case, for each array d in the list, you can compute:

min(d), np.quantile(d, 0.25), np.median(d), np.quantile(d, 0.75), max(d)
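As a rough sketch of how that could be applied to every array in the list, the helper below wraps the same five values in a function. The name `five_number_summary` and the sample arrays are made up for illustration. Note that the minimum and maximum only correspond to the whisker ends when the whiskers span the full data range, as with `whis=np.inf` in the question.

```python
import numpy as np

def five_number_summary(d):
    """Return (min, Q1, median, Q3, max) for one 1-D array-like."""
    d = np.asarray(d, dtype=float)
    # np.percentile computes all three quartile values in a single call.
    q1, med, q3 = np.percentile(d, [25, 50, 75])
    return d.min(), q1, med, q3, d.max()

# Hypothetical data: one array per box in the plot.
groups = [np.array([1, 2, 3, 4, 5]), np.array([10, 20, 30])]

summaries = [five_number_summary(g) for g in groups]
for s in summaries:
    print(s)
```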
Daniel Hasegan