4

I need to apply a rolling mean to a column, shown as s3 in pic1. After applying the rolling mean with window = 5, I get the correct answer, but the first 4 rows are left empty, as shown in column sa3 of pic2.

I want to fill the first 4 empty cells of sa3 in pic2 with the mean of all the data in s3 (pic1) up to the current row, as shown in column a3 of pic3.

How can I do this with a simple function besides the rolling mean method?

pic1

pic2

pic3

cs95
Pepin Peng

4 Answers

4

I think you need the parameter min_periods=1 in rolling:

min_periods : int, default None

Minimum number of observations in window required to have a value (otherwise result is NA). For a window that is specified by an offset, this will default to 1.

df = df.rolling(5, min_periods=1).mean()

Sample:

import numpy as np
import pandas as pd

np.random.seed(1256)

df = pd.DataFrame(np.random.randint(10, size=(10, 5)), columns=list('abcde'))
print (df)
   a  b  c  d  e
0  1  5  8  8  9
1  3  6  3  0  6
2  7  0  1  5  1
3  6  6  5  0  4
4  4  9  4  6  1
5  7  7  5  8  3
6  0  7  2  8  2
7  4  8  3  5  5
8  8  2  0  9  2
9  4  7  1  5  1

df = df.rolling(5, min_periods=1).mean()
print (df)
          a         b     c         d         e
0  1.000000  5.000000  8.00  8.000000  9.000000
1  2.000000  5.500000  5.50  4.000000  7.500000
2  3.666667  3.666667  4.00  4.333333  5.333333
3  4.250000  4.250000  4.25  3.250000  5.000000
4  4.200000  5.200000  4.20  3.800000  4.200000
5  5.400000  5.600000  3.60  3.800000  3.000000
6  4.800000  5.800000  3.40  5.400000  2.200000
7  4.200000  7.400000  3.80  5.400000  3.000000
8  4.600000  6.600000  2.80  7.200000  2.600000
9  4.600000  6.200000  2.20  7.000000  2.600000
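
Applied to a single column as in the question, a minimal sketch might look like the following. The column names s3 and sa3 are taken from the question's pictures, the data values are made up for illustration, and the extra "expanding" column is only there for comparison. While the window holds fewer than 5 rows, the result is the mean of all values up to the current row.

```python
import pandas as pd

# Hypothetical data; only the column names s3 / sa3 come from the question.
df = pd.DataFrame({"s3": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0]})

# With min_periods=1 the first rows use however many values are available.
df["sa3"] = df["s3"].rolling(5, min_periods=1).mean()

# For comparison: the expanding mean is the mean of all rows up to the current one.
df["expanding"] = df["s3"].expanding().mean()

print(df)
```

The two columns agree for the first five rows and diverge once the window starts dropping older values.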
jezrael
0

So you want to add:

df['sa3'] = df['sa3'].fillna(df['s3'].mean())

Hopefully I used the correct column names.
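
Note that df['s3'].mean() is the mean of the whole column, so every empty cell would receive the same value. If the goal is the mean of all values up to the current row, as the question describes, one option is to fill from an expanding mean instead. A minimal sketch, with made-up data and the column names assumed from the question:

```python
import pandas as pd

# Hypothetical data; column names s3 / sa3 are assumed from the question.
df = pd.DataFrame({"s3": [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]})

# Plain rolling mean: the first 4 rows are NaN because the window is not full.
df["sa3"] = df["s3"].rolling(5).mean()

# Fill only those NaN rows with the mean of all values up to each row.
df["sa3"] = df["sa3"].fillna(df["s3"].expanding().mean())

print(df)
```

Rows that already have a rolling mean are left untouched, because fillna only replaces missing values.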

zipa
0

You can use pandas to find the rolling mean and then fill the NaN with zero.

Use something like the following:

import pandas as pd

col = [1, 2, 3, 4, 5, 6, 7, 8, 9]
df = pd.DataFrame(col)

# rolling mean over 5 rows; the first 4 rows are NaN and are replaced by 0
df['rm'] = df[0].rolling(5).mean().fillna(0)
print(df)

   0   rm
0  1  0.0
1  2  0.0
2  3  0.0
3  4  0.0
4  5  3.0
5  6  4.0
6  7  5.0
7  8  6.0
8  9  7.0
ajsp
Thanks, but I need more accurate numbers. I do use 0 for values that cannot be calculated, just not for all of them. jezrael gave the answer I intended to ask for. – Pepin Peng Jun 26 '18 at 13:25
0

I see that some of the answers deal with nulls by replacing them with the overall mean, while others compute a rolling mean but do not use it to replace the nulls. So I worked out the code myself and am posting it here.

df['Col']= df['Col'].fillna(df['Col'].rolling(4,center=True,min_periods=1).mean())

Here 4 is the length of the rolling window. center=True means the window is centered on each row, so a null is replaced using the values in roughly half the window above it and half below it.
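
To see the effect on a small example, here is a minimal sketch. The data values are made up for illustration; 'Col' is the column name used in the line above.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two missing values.
df = pd.DataFrame({"Col": [1.0, 2.0, np.nan, 4.0, 5.0, np.nan, 7.0]})

# Centered rolling mean; NaNs inside a window are skipped, and min_periods=1
# lets a window produce a value as long as it holds at least one number.
centered = df["Col"].rolling(4, center=True, min_periods=1).mean()

# Only the missing entries are replaced; existing values stay untouched.
df["Col"] = df["Col"].fillna(centered)

print(df)
```

Unlike the question's case, this fills gaps in the middle of the data from their neighbors rather than filling the leading rows of a rolling mean.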

Rahib