Average value in field Advanced pandas

Question

I have a question on advanced pandas. Currently, my dataframe's columns are celebrities, date (YYYY-MM-DD), and No. of followers. Each row shows the number of new followers for that date.

However, I would like to calculate the average no. of new followers between 2020/1/1 and 2020/4/1 for each celebrity, as a table with only the celebrity and the no. of followers as columns.

How do I write Python code for this?

Thank you very much!

Try: `df.groupby(['celebrities'])['No. of followers'].mean()` — hacker315, May 10 '20 at 16:17
Does this answer your question? [Get statistics for each group (such as count, mean, etc) using pandas GroupBy?](https://stackoverflow.com/questions/19384532/get-statistics-for-each-group-such-as-count-mean-etc-using-pandas-groupby) — Joe, May 10 '20 at 17:04
Hi! Thank you for your answer Joe. This is really useful with how to calculate the mean with the groupby function. However, I'm still not too sure how to incorporate the date filter with the groupby function. — Avery, May 10 '20 at 17:53

score 0 · Answer 1 · answered May 10 '20 at 16:22

You can use groupby to gather all rows by celebrity.

    df_grouped = df.groupby('celebrities')
    for name, group in df_grouped:
        # mean number of followers for this celebrity
        print(name, group['No. of followers'].mean())

This prints the average number of followers for each celebrity. You can add your date filter too if you like, for example `group[group['date'] > X]['No. of followers'].mean()`.
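
As a minimal sketch of the filter-then-group idea, the date filter and the mean can also be chained into one expression. The sample data below is made up, and the column names `celebrities`, `date`, and `No. of followers` are assumed from the question:

```python
import pandas as pd

# Made-up sample data; column names are assumed from the question.
df = pd.DataFrame({
    "celebrities": ["A", "A", "A", "B", "B", "B"],
    "date": ["2020-01-01", "2020-02-01", "2020-05-01",
             "2020-01-15", "2020-04-01", "2020-06-01"],
    "No. of followers": [100, 200, 900, 10, 30, 500],
})
df["date"] = pd.to_datetime(df["date"])

# Keep rows inside the date range (both ends inclusive),
# then average the followers per celebrity.
in_range = df["date"].between("2020-01-01", "2020-04-01")
result = (
    df[in_range]
    .groupby("celebrities", as_index=False)["No. of followers"]
    .mean()
)
print(result)
```

With `as_index=False`, the result should be a dataframe with just the two columns asked for, one row per celebrity.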

answered May 10 '20 at 16:22

Roim


Hey Roim! Thank you for the solution. Any chance I'm able to do this in one line? – Avery May 10 '20 at 17:06
hmmm I think this `df.groupby('celebrities')['No. of followers'].mean()` will work but I didn't try myself – Roim May 10 '20 at 20:18

score 0 · Answer 2 · answered Feb 26 '22 at 18:54

If you want to incorporate the date filter, you need to filter your dataframe first:

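    import pandas as pd

    # Parse the dates so they compare as timestamps rather than strings
    df["date"] = pd.to_datetime(df["date"])
    start_date = "2020-01-01"
    end_date = "2020-04-01"

    # Keep only rows inside the date range (both ends inclusive)
    mask = (df["date"] >= start_date) & (df["date"] <= end_date)
    df = df.loc[mask]

    # Mean number of followers per celebrity, as a two-column table
    grouped = df.groupby("celebrities").agg({"No. of followers": "mean"}).reset_index()

    # One single-row dataframe per celebrity, keyed by name
    dfs = {}
    for c in grouped["celebrities"].unique():
        dfs[c] = grouped[grouped["celebrities"] == c]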

You can then access each dataframe in the dictionary, using the celebrity name as the key.
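
If only the average itself is needed for each celebrity, a Series indexed by celebrity name may be simpler than a dictionary of one-row dataframes. A minimal sketch, with made-up data that is assumed to be date-filtered already and column names assumed from the question:

```python
import pandas as pd

# Made-up data, assumed to be filtered by date already;
# column names are assumed from the question.
df = pd.DataFrame({
    "celebrities": ["A", "A", "B", "B"],
    "No. of followers": [100, 200, 10, 30],
})

# One mean per celebrity, indexed by celebrity name.
means = df.groupby("celebrities")["No. of followers"].mean()

print(means.loc["A"])   # look up a single celebrity
print(means.to_dict())  # plain dict keyed by celebrity name
```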

Hope that helps and please let me know if this answers your question.
