age_cleaned_titanic_df.groupby('Age_group').mean()
age_cleaned_titanic_df.groupby('Age_group').get_group((0,10])

The get_group function raises errors because the 'Age_group' column holds half-open bin intervals: (0,10], (10,20], ..., (70,80].

How do I call get_group() in this case? Most examples in the documentation and on Stack Overflow use column values that are strings or numbers, where get_group() is straightforward. How do I do it when the groupby column is a category?

Pradyut Vatsa
  • Did you try `age_cleaned_titanic_df.groupby('Age_group').get_group('(0,10]')`? – MaxU - stand with Ukraine Jun 01 '16 at 09:33
  • Yes, I did. It gives a key error: `KeyError: '(0,10]'` – Pradyut Vatsa Jun 01 '16 at 09:51
  • what about `age_cleaned_titanic_df.groupby('Age_group').get_group('(0, 10]')`? [how-to-make-good-reproducible-pandas-examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples/32536193#32536193) – MaxU - stand with Ukraine Jun 01 '16 at 10:33
  • You will have to provide a reproducible example (some runnable code that shows the problem), as normally `get_group` should work fine with categoricals (also show your pandas version). – joris Jun 01 '16 at 10:35

1 Answer


There must be a space after the comma, between 0 and 10, like '(0, 10]'. The key has to match the category label exactly.

Here is a small demonstration:

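import numpy as np
import pandas as pd

# 20 random ages in [10, 30), binned into half-open intervals (10, 15], (15, 20], ...
df = pd.DataFrame({'age': np.random.randint(10,30,20)})
df['Age_group'] = pd.cut(df.age, bins=[10, 15, 20, 25, 30])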

this works:

In [141]: df.groupby('Age_group').get_group('(10, 15]')
Out[141]:
    age Age_group
1    11  (10, 15]
6    12  (10, 15]
11   13  (10, 15]
12   14  (10, 15]
14   15  (10, 15]
15   12  (10, 15]
17   14  (10, 15]
18   13  (10, 15]

now the same, but without whitespace between the values:

In [142]: df.groupby('Age_group').get_group('(10,15]')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-142-53b04eccd579> in <module>()
----> 1 df.groupby('Age_group').get_group('(10,15]')

...

KeyError: '(10,15]'
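
If you are not sure how a label is formatted, one option is to read the keys from the column's categories instead of typing them by hand. This is only a sketch: the fixed ages below are made up so the result is reproducible, and `observed=True` assumes a pandas version that supports that argument.

```python
import pandas as pd

# Fixed ages (hypothetical data) so the example is reproducible.
df = pd.DataFrame({'age': [11, 12, 16, 21, 27]})
df['Age_group'] = pd.cut(df['age'], bins=[10, 15, 20, 25, 30])

# The category labels are the keys that get_group() accepts.
categories = df['Age_group'].cat.categories
print(list(categories))

# Take a key straight from the categories, so no formatting guesswork is needed.
first_key = categories[0]
first_group = df.groupby('Age_group', observed=True).get_group(first_key)
print(first_group)
```

Because the key comes from the data itself, this should work whether the labels are stored as strings or as interval objects.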

Data:

In [139]: df
Out[139]:
    age Age_group
0    25  (20, 25]
1    11  (10, 15]
2    27  (25, 30]
3    24  (20, 25]
4    27  (25, 30]
5    10       NaN
6    12  (10, 15]
7    20  (15, 20]
8    16  (15, 20]
9    29  (25, 30]
10   21  (20, 25]
11   13  (10, 15]
12   14  (10, 15]
13   21  (20, 25]
14   15  (10, 15]
15   12  (10, 15]
16   29  (25, 30]
17   14  (10, 15]
18   13  (10, 15]
19   19  (15, 20]
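
A note for newer pandas versions: as far as I know, since pandas 0.20 `pd.cut` stores its bins as `pd.Interval` objects rather than strings, so a string key such as '(10, 15]' may raise a KeyError there regardless of spacing. A minimal sketch of two ways around that, using made-up ages and hypothetical label names ('10-15', etc.):

```python
import pandas as pd

# Fixed ages (hypothetical data) so the example is reproducible.
bins = [10, 15, 20, 25, 30]
df = pd.DataFrame({'age': [11, 12, 16, 21, 27]})
df['Age_group'] = pd.cut(df['age'], bins=bins)

# Option 1: build the key as an Interval.
# closed='right' corresponds to the "(10, 15]" notation.
key = pd.Interval(10, 15, closed='right')
by_interval = df.groupby('Age_group', observed=True).get_group(key)
print(by_interval)

# Option 2: give pd.cut explicit string labels, so the group keys are plain strings.
# The label names here are arbitrary.
df['Age_label'] = pd.cut(df['age'], bins=bins,
                         labels=['10-15', '15-20', '20-25', '25-30'])
by_label = df.groupby('Age_label', observed=True).get_group('10-15')
print(by_label)
```

Option 2 is probably the closest to what the question was after, since the keys behave like ordinary strings.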
MaxU - stand with Ukraine