0

Hi everyone I have a dataframe starting with date 2019-10-03 and ends with 2019-10-28. I want my dataframe with date 2019-10-01 to 2019-10-31. with last column section filled with the present dataframe last column median. Need the correct way to do this.

indexDate                                                                                                    
2019-10-03 2019-10-03     NEFT-LB3670191003Z508-MR PRASAD GEETHA- BAD...   deposit  103300.0         476853.6
2019-10-04 2019-10-04     NEFT-N277190945376455-FINSHERPA INVESTMENT-...   deposit  103300.0         779542.6
2019-10-05 2019-10-05    NEFT-SAA68109135-THE METAL POWDER COMPANY- L...   deposit   39400.0         349057.6
2019-10-06 2019-10-05    NEFT-SAA68109135-THE METAL POWDER COMPANY- L...   deposit   39400.0         349057.6
2019-10-07 2019-10-07     NEFT-N280190947374122-PRAVIN L RAJENDRAN-PR...   deposit  103019.0         452076.6
2019-10-08 2019-10-07     NEFT-N280190947374122-PRAVIN L RAJENDRAN-PR...   deposit  103019.0         452076.6
2019-10-09 2019-10-09    NEFT-PUNBH19282586104-BHARATH LAJHNA MULTI- ...   deposit   40000.0         230377.6
2019-10-10 2019-10-10     MMT/IMPS/928316107218/Japan visa/M CHARLES ...   deposit    9200.0         264279.6
2019-10-11 2019-10-11     NEFT-N284190952298833-MADRAS TALKIESPROJECT...   deposit   19885.0         358396.6
2019-10-12 2019-10-11     NEFT-N284190952298833-MADRAS TALKIESPROJECT...   deposit   19885.0         358396.6
2019-10-13 2019-10-11     NEFT-N284190952298833-MADRAS TALKIESPROJECT...   deposit   19885.0         358396.6
2019-10-14 2019-10-14     RTGS:ICICR52019101400532057/TMBL0000165/FLY...  withdraw  200000.0         158396.6
2019-10-15 2019-10-15     MMT/IMPS/928816129663/Refund of exces/SKS- ...   deposit    1180.0         175076.6
2019-10-16 2019-10-16     NEFT-MSNUH19289000695-AMIT ACETYLENE PVT LT...   deposit   32700.0         207776.6
2019-10-17 2019-10-17           26482 TRFR TO: HERMES I TICKETS PVT LTD   withdraw  200000.0         161076.6
2019-10-18 2019-10-18              26485 TRFR TO:CHOICE TRAVEL SERVICES   withdraw   13653.0         408006.6
2019-10-19 2019-10-19           26486 TRFR TO: HERMES I TICKETS PVT LTD   withdraw  190000.0         276772.6
2019-10-20 2019-10-19           26486 TRFR TO: HERMES I TICKETS PVT LTD   withdraw  190000.0         276772.6
2019-10-21 2019-10-21    NEFT-PUNBH19294588987-BHARATH LAJHNA MULTI- ...   deposit   50000.0         216772.6
2019-10-22 2019-10-22    BIL/INFT/001828963217/Australia Tkts/ NARASI...   deposit   56400.0         226972.6
2019-10-23 2019-10-23           26489 TRFR TO: HERMES I TICKETS PVT LTD   withdraw  125000.0         437396.6
2019-10-24 2019-10-24          - BIL/INFT/001830247383/NA/ ARULBARATH A    deposit   68346.0         220186.6
2019-10-25 2019-10-25     NEFT-N298190964842360-TIBRO TOURS PRIVATE-P...   deposit   30000.0         187486.6
2019-10-26 2019-10-25     NEFT-N298190964842360-TIBRO TOURS PRIVATE-P...   deposit   30000.0         187486.6
2019-10-27 2019-10-25     NEFT-N298190964842360-TIBRO TOURS PRIVATE-P...   deposit   30000.0         187486.6
2019-10-28 2019-10-28                    26493 TRFR TO: YESUDAS PETER S   withdraw   33457.0         154029.6
Roshan
  • 27
  • 1
  • 6

2 Answers2

0

I think Calculate new column as the mean of other columns pandas will answer you question. Instead of mean you need to calculate median.

This can guide you to calculate median: https://www.geeksforgeeks.org/python-pandas-dataframe-median/

Gaurav Agarwal
  • 611
  • 6
  • 18
  • need to create two rows with index 2019-10-01 and 2019-10-02 with last column value as a median of present dataframe and same in the last need three dates as new rows. – Roshan Jan 13 '20 at 11:11
0

You can use reindex method of dataframe with DatetimeIndex and fill it with median values:

import pandas as pd
idx = pd.date_range('2019-10-01', '2019-10-31')
s = pd.Series({'2019-10-03': 476853.6,
               '2019-10-04': 1 779542.6,
               '2019-10-25': 187486.6,
               '2019-10-25': 154029.6})
s.index = pd.DatetimeIndex(s.index)

s = s.reindex(idx, fill_value=s.median())
print(s)
kotbegemot
  • 71
  • 5
  • date_range need the first date and last date. so each dataframe last date of that month and the first date of the month I have to find. – Roshan Jan 13 '20 at 11:04
  • I assumed you already know date range to be filled with medians according to this statement: > _want my dataframe with date 2019-10-01 to 2019-10-31_ – kotbegemot Jan 13 '20 at 11:11
  • yes, in this case, it's fixed, but this dataframe can be of any month and the last date of the month will be according to the month. – Roshan Jan 13 '20 at 11:15
  • 1
    If you need this to work for various months you may want to use `pandas.tseries.offsets.MonthEnd` functionality. This is a slightly different question, and it is well explained here: [https://stackoverflow.com/questions/37354105/find-the-end-of-the-month-pandas-dataframe-series] – kotbegemot Jan 13 '20 at 11:19
  • So the solution will be two steps. First I have to find the first date and the last date of the month then to use reindex with fill_value. – Roshan Jan 14 '20 at 05:26
  • Yes, that should work for your question. You can also try to reindex several months at once if you want to. – kotbegemot Jan 14 '20 at 06:19
  • "You can also try to reindex several months at once if you want to" means by giving date_range more then 1 month. – Roshan Jan 14 '20 at 07:06
  • Yes, you can do that too. – kotbegemot Jan 16 '20 at 07:58