Filter pandas GroupBy output in a single step (method chaining)

Question

I would like to filter the result of a pandas groupby directly, without first storing the groupby result in a variable. For example:

import pandas as pd

df = pd.DataFrame([("a", 1)]*3 + [("b", 1)]*2 + [("c", 1)], columns=["title", "counts"])

res = df.groupby("title").agg({"counts": "sum"})  # I want to skip creating res

my_res = res.loc[res.counts > 2]

In the above example, I would like to create my_res with a one-liner. In Spark/Scala this can be achieved simply by chaining a filter operation, but in pandas filter has a different purpose: DataFrame.filter selects rows or columns by label, not by value.

`df.groupby("title").agg({"counts":"sum"}).query('counts > 2')` — cs95, Feb 06 '19 at 10:51
I would also recommend taking a look at [this](https://stackoverflow.com/q/53779986/4909087) post of mine. — cs95, Feb 06 '19 at 10:52
Thank you, it does the work. If you want to post an answer, I will accept it; I will check the recommended post also — geompalik, Feb 06 '19 at 10:55

score 2 · Accepted Answer · answered Feb 06 '19 at 12:00

Use query to chain this step:

df.groupby("title").agg({"counts":"sum"}).query('counts > 2')

       counts
title        
a           3
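
As a rough sketch of how this compares with other chainable options, the snippet below rebuilds the question's df and filters the aggregated result two ways: with query, as above, and with .loc given a callable, which receives the intermediate frame so no named variable is needed. It also shows GroupBy.filter for contrast, since that method keeps or drops the original rows of whole groups rather than filtering the aggregated output. The variable names (via_query, via_loc, rows_kept) are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [("a", 1)] * 3 + [("b", 1)] * 2 + [("c", 1)],
    columns=["title", "counts"],
)

# Option 1: query takes a string expression evaluated on the aggregated frame.
via_query = df.groupby("title").agg({"counts": "sum"}).query("counts > 2")

# Option 2: .loc accepts a callable; it is called with the aggregated frame.
via_loc = df.groupby("title").agg({"counts": "sum"}).loc[lambda d: d["counts"] > 2]

# For contrast: GroupBy.filter returns the ORIGINAL rows of every group whose
# predicate is True, so the result here is not aggregated.
rows_kept = df.groupby("title").filter(lambda g: g["counts"].sum() > 2)

print(via_loc)
print(rows_kept)
```

query is compact for simple comparisons, while the callable form may be easier when the column name is not a valid Python identifier or the condition needs arbitrary Python.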

cs95
