I have a large dataset of servers at target locations. I used the following code to calculate the per-site mean of a set of values and attach it to each server row.

# suffixes keeps the original column name and labels the merged-in mean DSKPERCENT_MEAN
df4 = df4.merge(
    df4.groupby('SITE', as_index=False).agg({'DSKPERCENT': 'mean'}),
    on='SITE', how='left', suffixes=('', '_MEAN'))

Sample Resulting DF

Site  Server           DSKPERCENT      DSKPERCENT_MEAN
A      1                12                 11
A      2                10                 11
A      3                11                 11
B      1                9                  9
B      2                12                 9
B      3                7                  9
C      1                12                 13
C      2                12                 13
C      3                16                 13

What I need now is to print/export the newly calculated mean per site. How can I print/export just the single unique calculated mean value per site (i.e., Site A has a calculated mean of 11, Site B of 9, etc.)?

odonnry
  • Hi @odonnry, welcome to stackoverflow. Could you please provide a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example)? For example, showing a small portion of the dataframes here (not in picture format, but a copy of the first lines), would help us solve the issue, and you'll get an answer quicker. – jlb_gouveia Nov 05 '20 at 21:54
  • Your question isn't entirely clear without sample input and output. See [How to make good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). Are you saying you just want the grouped data by itself? Just the output of `df4.groupby('SITE',as_index=False).agg({'DSKPERCENT_GB':'mean'})`? – G. Anderson Nov 05 '20 at 21:54
  • @jlb_gouveia and G. Anderson, thanks for the feedback. I will look to add an example as soon as I can, but to clarify my question: I'm looking to extract the calculated mean per site as a single value per site. Apologies for the grayness of my question – odonnry Nov 05 '20 at 22:18
  • @jlb_gouveia, I added what the df would look like. – odonnry Nov 06 '20 at 12:41

1 Answer

IIUC, you're looking for a groupby -> transform type of operation. Essentially, transform is similar to agg, except that the results are broadcast back to the shape of the original data, so every row receives its group's value.
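To make the shape difference concrete, here is a minimal sketch on a hypothetical three-row frame (the names toy, per_group, and per_row are made up for illustration and are not from the question):

```python
import pandas as pd

# Hypothetical toy frame, used only to compare result shapes
toy = pd.DataFrame({"groups": list("aab"), "values": [1, 3, 10]})

# Aggregation: one result per group, indexed by the group key
per_group = toy.groupby("groups")["values"].mean()

# transform: one result per original row, aligned to the original index
per_row = toy.groupby("groups")["values"].transform("mean")

print(per_group.shape)  # (2,)
print(per_row.shape)    # (3,)
```

The aggregated result has one entry per group, while the transformed result has one entry per input row, which is why it can be assigned straight back as a new column.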

Sample Data

import pandas as pd

df = pd.DataFrame({
    "groups": list("aaabbbcddddd"),
    "values": [1,2,3,4,5,6,7,8,9,10,11,12]
})

df
   groups  values
0       a       1
1       a       2
2       a       3
3       b       4
4       b       5
5       b       6
6       c       7
7       d       8
8       d       9
9       d      10
10      d      11
11      d      12

Method

df["group_mean"] = df.groupby("groups")["values"].transform("mean")

print(df)
   groups  values  group_mean
0       a       1         2.0
1       a       2         2.0
2       a       3         2.0
3       b       4         5.0
4       b       5         5.0
5       b       6         5.0
6       c       7         7.0
7       d       8        10.0
8       d       9        10.0
9       d      10        10.0
10      d      11        10.0
11      d      12        10.0
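
If the goal is a single mean per group rather than a broadcast column, a plain aggregation may be closer to what the question asks for. The following is a minimal sketch, assuming the column names SITE and DSKPERCENT from the question's code and the values from its sample table; SERVER, site_means, and csv_text are hypothetical names chosen for illustration.

```python
import pandas as pd

# Sample values copied from the question's table
df4 = pd.DataFrame({
    "SITE": list("AAABBBCCC"),
    "SERVER": [1, 2, 3] * 3,
    "DSKPERCENT": [12, 10, 11, 9, 12, 7, 12, 12, 16],
})

# Aggregate instead of broadcasting: the result has one row per site
site_means = (
    df4.groupby("SITE", as_index=False)["DSKPERCENT"]
       .mean()
       .rename(columns={"DSKPERCENT": "DSKPERCENT_MEAN"})
)
print(site_means)

# With no path, to_csv returns the CSV text; pass a file path to write to disk
csv_text = site_means.to_csv(index=False)
print(csv_text)
```

If the broadcast column already exists, as in the example above, selecting the key and mean columns and calling drop_duplicates(), e.g. df[["groups", "group_mean"]].drop_duplicates(), should give the same one-row-per-group view.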
Cameron Riddell
  • Hi @Cameron. Sorta, but what I'm looking for is how to extract that one single group_mean value per group (in your example). In other words, how do I extract and place into a new df, maybe, group A - group_mean = 2, group B - group_mean = 5, etc., rather than all three entries of (2) for Group A, all three entries of (5) for Group B, etc.? Apologies if I'm not being clear – odonnry Nov 05 '20 at 22:16