
I have what superficially appears to be a simple question, but I cannot find the answer. I have a feature (column) in my df that I would like to group by two different categorical columns. Here's my pseudocode:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(100, 2), columns=['Col1', 'Col2'])
# Assume each series below repeats.
df['X'] = np.tile(['A', 'B'], 50)  # 'A','B','A','B',... across all 100 rows
df['Y'] = np.tile(['X', 'Y'], 50)  # 'X','Y','X','Y',... across all 100 rows

How can I use groupby to create 4 box plots for a particular feature in the df (e.g., keys {'A','X'}, {'B','X'}, {'A','Y'}, {'B','Y'}), one for each data series? I can do the following:

df['Col1'].groupby([df.X,df.Y]).describe()

...what's the analogue for a box plot?

GPB
  • As written you'll only ever get 2 combinations ((A,X) and (B,Y)). But as for those, did you look at the documentation for boxplot? What's wrong with `df.boxplot(column='Col1', by=['X', 'Y'])`? – Ajean Aug 12 '15 at 23:19
  • @Ajean - Sorry, assume the series for 'X' and 'Y' are random. Good spot. I'll try your suggestion. – GPB Aug 13 '15 at 00:08
  • @Ajean - That worked....thanks from a newbie! :-) – GPB Aug 13 '15 at 00:17
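
For anyone landing here, a minimal sketch of the approach suggested in the comments above, assuming 'X' and 'Y' are drawn at random so that all four combinations can occur. The fixed seed, the `Agg` backend, and the `group_sizes` check are assumptions added for illustration and are not part of the original question:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # assumption: fixed seed for reproducibility
df = pd.DataFrame(rng.random((100, 2)), columns=["Col1", "Col2"])
df["X"] = rng.choice(["A", "B"], size=len(df))  # random category labels
df["Y"] = rng.choice(["X", "Y"], size=len(df))

# One box per (X, Y) combination, all drawn on the same axes.
fig, ax = plt.subplots()
df.boxplot(column="Col1", by=["X", "Y"], ax=ax)

# Same grouping as the describe() call in the question, to see how many
# observations feed each box.
group_sizes = df.groupby(["X", "Y"])["Col1"].size()
print(group_sizes)
```

In an interactive session you would drop the `matplotlib.use("Agg")` line and call `plt.show()`, or keep it and call `fig.savefig(...)` with a file name of your choice.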

0 Answers