I have a dataset like this:

State  %internet2016  %internet2018  %white 2016  %white 2018
QLD    85             90             85           82
NSW    83             92             80           72
WA     82             88             82           78
SA     86             86             88           84
TAS    78             80             92           88
NT     48             65             76           75

My goal is to create a dataset like this:

State  %internet2016-2018  %white2016-2018
QLD    87.5                83.5
NSW    87.5                76
WA     85                  80
SA     86                  86
TAS    79                  90
NT     56.5                75.5

That is, for each state I would like the mean of the two %internet columns and the mean of the two %white columns.

I tried this, but it did not give me what I want:

df = df["%internet2016","%internet2018"].mean()
  • Does this answer your question? [Row-wise average for a subset of columns with missing values](https://stackoverflow.com/questions/34734940/row-wise-average-for-a-subset-of-columns-with-missing-values), or better https://stackoverflow.com/questions/48366506/calculate-new-column-as-the-mean-of-other-columns-in-pandas – mkrieger1 Feb 20 '23 at 19:33

2 Answers

You can get the mean by adding each pair of columns and dividing by 2, then dropping the original columns:

df['%internet2016-2018'] = (df['%internet2016'] + df['%internet2018']) / 2
df['%white2016-2018'] = (df['%white 2016'] + df['%white 2018']) / 2
df = df.drop(['%internet2016', '%internet2018', '%white 2016', '%white 2018'], axis=1)
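
For reference, the attempt in the question fails for two reasons: `df["a", "b"]` is treated as a lookup of a single key (the tuple `("a", "b")`) rather than a selection of two columns, and `.mean()` averages down each column by default instead of across each row. A minimal, self-contained sketch of the same idea using a list of column names and `mean(axis=1)` might look like the following. The sample frame is rebuilt from the question's data, and the variable names `internet_cols`, `white_cols`, and `result` are illustrative assumptions.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "State": ["QLD", "NSW", "WA", "SA", "TAS", "NT"],
    "%internet2016": [85, 83, 82, 86, 78, 48],
    "%internet2018": [90, 92, 88, 86, 80, 65],
    "%white 2016": [85, 80, 82, 88, 92, 76],
    "%white 2018": [82, 72, 78, 84, 88, 75],
})

# Hypothetical helper lists; note the double brackets this implies
# when selecting: df[[...]] picks several columns at once.
internet_cols = ["%internet2016", "%internet2018"]
white_cols = ["%white 2016", "%white 2018"]

# axis=1 averages across the selected columns for each row (state).
result = pd.DataFrame({
    "State": df["State"],
    "%internet2016-2018": df[internet_cols].mean(axis=1),
    "%white2016-2018": df[white_cols].mean(axis=1),
})

print(result)
```

Unlike adding the columns and dividing by 2, `mean(axis=1)` keeps working unchanged if more years are added to either list, and it skips missing values by default.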



Concatenate the State column with the row-wise means of the %internet and %white groups of columns:

pd.concat([df['State'], 
           df.filter(like='%internet').pipe(lambda df: df.mean(axis=1)
                                            .rename('%internet'+'-'.join(c[-4:] for c in df.columns))),
           df.filter(like='%white').pipe(lambda df: df.mean(axis=1)
                                         .rename('%white'+'-'.join(c[-4:] for c in df.columns)))], axis=1)

  State  %internet2016-2018  %white2016-2018
0  QLD                 87.5             83.5
1  NSW                 87.5             76.0
2   WA                 85.0             80.0
3   SA                 86.0             86.0
4  TAS                 79.0             90.0
5   NT                 56.5             75.5
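
If there are many metrics, the same grouping can be done without naming each one. The sketch below is one possible variation, not part of the answer above: it strips the trailing year from each column name and groups the transposed frame by what remains. The helper `metric_name` and the hard-coded `"2016-2018"` suffix are assumptions made for illustration.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "State": ["QLD", "NSW", "WA", "SA", "TAS", "NT"],
    "%internet2016": [85, 83, 82, 86, 78, 48],
    "%internet2018": [90, 92, 88, 86, 80, 65],
    "%white 2016": [85, 80, 82, 88, 92, 76],
    "%white 2018": [82, 72, 78, 84, 88, 75],
})


def metric_name(column: str) -> str:
    """Hypothetical helper: drop a trailing four-digit year and any space before it."""
    return re.sub(r"\s*\d{4}$", "", column)


# Move State into the index so only numeric columns remain.
values = df.set_index("State")

# Transpose so the original column names become the index, group the rows
# that share a metric name, average them, and transpose back.
means = values.T.groupby(metric_name).mean().T

# The year range is hard-coded here as an assumption about the data.
means = means.add_suffix("2016-2018").reset_index()

print(means)
```

Grouping on the transposed frame avoids `groupby(axis=1)`, which is deprecated in recent pandas releases.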