0

I have this pandas dataframe with daily asset prices: Picture of head of Dataframe

I would like to create a pandas series (It could also be an additional column in the dataframe or some other datastructure) with the weakly average asset prices. This means I need to calculate the average on every 7 consecutive instances in the column and save it into a series.

Picture of how result should look like

As I am a complete newbie to python (and programming in general, for that matter), I really have no idea how to start.

I am very grateful for every tipp!

  • Welcome to StackOverflow. Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. – jezrael Apr 21 '18 at 06:40
  • [Please don't post images of code (or links to them)](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question) – jezrael Apr 21 '18 at 06:41
  • 1
    Thank you for the infos! It's my first question, so sorry for the mess. I can do better in the future! – Malte Brunst Apr 21 '18 at 08:20
  • Glad can help! Nice day! – jezrael Apr 21 '18 at 08:21

1 Answers1

0

I believe need GroupBy.transform by modulo of numpy array create by numpy.arange for general solution also working with all indexes (e.g. with DatetimeIndex):

np.random.seed(2018)

rng = pd.date_range('2018-04-19', periods=20)
df = pd.DataFrame({'Date': rng[::-1], 
                   'ClosingPrice': np.random.randint(4, size=20)})  
#print (df)

df['weekly'] = df['ClosingPrice'].groupby(np.arange(len(df)) // 7).transform('mean')
print (df)
    ClosingPrice       Date    weekly
0              2 2018-05-08  1.142857
1              2 2018-05-07  1.142857
2              2 2018-05-06  1.142857
3              1 2018-05-05  1.142857
4              1 2018-05-04  1.142857
5              0 2018-05-03  1.142857
6              0 2018-05-02  1.142857
7              2 2018-05-01  2.285714
8              1 2018-04-30  2.285714
9              1 2018-04-29  2.285714
10             3 2018-04-28  2.285714
11             3 2018-04-27  2.285714
12             3 2018-04-26  2.285714
13             3 2018-04-25  2.285714
14             1 2018-04-24  1.666667
15             0 2018-04-23  1.666667
16             3 2018-04-22  1.666667
17             2 2018-04-21  1.666667
18             2 2018-04-20  1.666667
19             2 2018-04-19  1.666667

Detail:

print (np.arange(len(df)) // 7)
[0 0 0 0 0 0 0 1 1 1 1 1 1 1 2 2 2 2 2 2]
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • An additional question came up, maybe you can help again: How can I keep the rows representing the first day of the week (in your example: 2018-04-19, 2018-04-26, 2018-05-03,...) and delete all other rows. In the end I only have one row (with the first day of the week) per weekly price. – Malte Brunst Apr 21 '18 at 15:27