2

I have a DataFrame of shape 1500x11. I need to take each consecutive block of 15 rows and compute the mean of each of the 11 columns separately, so my final DataFrame should have shape 100x11. How can I do this in Python?

  • I think this could help: https://stackoverflow.com/questions/36810595/calculate-average-of-every-x-rows-in-a-table-and-create-new-table –  Sep 25 '20 at 08:24

3 Answers

1

I don't know much about pandas, so I've written the following solution in pure NumPy. It uses no Python loops, so it is very efficient. The result is converted back to a pandas DataFrame:


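import numpy as np
import pandas as pd

# Example data: 1500 rows, 11 columns
df = pd.DataFrame([[i + j for j in range(11)] for i in range(1500)])
a = df.values
# Reshape to (100, 15, 11); the row count must be a multiple of 15
a = a.reshape((a.shape[0] // 15, 15, a.shape[1]))
# Average over the 15 rows inside each block
a = np.mean(a, axis=1)
df = pd.DataFrame(a)
print(df)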
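To see why the reshape works, here is a minimal sketch on a tiny array, assuming a block size of 3 instead of 15. The array and the names `block`, `small`, and `trimmed` are made up for illustration. It also trims leftover rows first, since `reshape` raises an error when the row count is not a multiple of the block size.

```python
import numpy as np

block = 3  # illustrative block size; the question uses 15
small = np.arange(14, dtype=float).reshape(7, 2)  # 7 rows, 2 columns

# Drop trailing rows that do not fill a complete block (here: the 7th row).
usable = (small.shape[0] // block) * block
trimmed = small[:usable]

# Shape (n_blocks, block, n_columns): axis 1 indexes the rows inside a block.
stacked = trimmed.reshape(-1, block, trimmed.shape[1])
block_means = stacked.mean(axis=1)

print(block_means)
```

Whether dropping the incomplete last block is the right choice depends on the data; averaging the shorter remainder separately is another option.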
Arty
1

The following should work:

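import pandas as pd

# DataFrame.append was removed in pandas 2.0, so collect the block means in a list
rows = []
for i in range(len(df) // 15):
    df2 = df.iloc[i * 15:i * 15 + 15, :]
    rows.append(df2.mean())  # Series with one mean per column
dfnew = pd.DataFrame(rows).reset_index(drop=True)

print(dfnew)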
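The same block-wise mean can probably be written without an explicit loop by grouping on the integer-divided row position. This is a sketch rather than part of the original answer; the stand-in data and the name `block_id` are assumptions.

```python
import numpy as np
import pandas as pd

# Stand-in data with the shape from the question (1500 rows, 11 columns).
df = pd.DataFrame(np.arange(1500 * 11, dtype=float).reshape(1500, 11))

# Integer-divide the row position by 15 to get a block label:
# 0 for rows 0-14, 1 for rows 15-29, and so on.
block_id = np.arange(len(df)) // 15

dfnew = df.groupby(block_id).mean()
print(dfnew.shape)  # (100, 11)
```

Using the row position rather than `df.index` keeps the grouping correct even when the index is not a plain 0..n-1 range.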
IoaTzimas
  • One column of my DataFrame is a datetime column. How do I get its mean? The above method displays it as "NaT". TIME 2020-04-01 06:01:00 2020-04-01 06:02:00 2020-04-01 06:03:00 2020-04-01 06:04:00 ....................................... TIME 2020-04-02 17:42:00 2020-04-02 17:43:00 2020-04-02 17:44:00 2020-04-02 17:45:00 2020-04-02 17:46:00 – Ankit Kumar Singh Sep 25 '20 at 09:36
  • Take a look here: https://stackoverflow.com/questions/27907902/datetime-objects-with-pandas-mean-function – IoaTzimas Sep 25 '20 at 09:38
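One possible way to average a datetime column per block is to average the seconds elapsed since a reference time and convert back. This is a sketch under assumptions, not the method from the linked question: the data is made up, and the `value` column and the names `block_id`, `origin`, and `elapsed` are illustrative.

```python
import numpy as np
import pandas as pd

# Made-up data: 30 timestamps one minute apart, like the TIME column above.
df = pd.DataFrame({
    "TIME": pd.date_range("2020-04-01 06:01:00", periods=30, freq="min"),
    "value": np.arange(30, dtype=float),
})

block_id = np.arange(len(df)) // 15

# Average the numeric columns as usual.
means = df.drop(columns="TIME").groupby(block_id).mean()

# Average the timestamps as seconds elapsed since a reference time,
# then convert the averaged offsets back to timestamps.
origin = df["TIME"].iloc[0]
elapsed = (df["TIME"] - origin).dt.total_seconds()
means["TIME"] = origin + pd.to_timedelta(elapsed.groupby(block_id).mean(), unit="s")

print(means)
```

Working with offsets from `origin` keeps the numbers small, which avoids the rounding that can occur when averaging raw nanosecond timestamps as floats.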
0

You can use pandas.DataFrame.

Use a for loop over the columns to compute the means, saving one mean for every 15 entries.

means = {}
for column, values in df.items():
    # compute the mean of each consecutive block of 15 entries in this column
    means[column] = [values.iloc[i:i + 15].mean() for i in range(0, len(values), 15)]

Then use pd.DataFrame() to create the new DataFrame, for example pd.DataFrame(means).

I'd recommend reading the documentation: https://pandas.pydata.org/pandas-docs/stable/reference/frame.html

prody