I'm having this issue and I can't find a solution.

I have this data frame:

Date, Product, Value1, Value2
13-3-2020, A, 10, 15
13-4-2020, A, 11, 26
13-5-2020, A, 14, 14
2-2-2018, B, 10, 15
18-2-2018, B, 11, 26
5-4-2018, B, 14, 14
5-5-2018, B, 12, 12

I would like to insert a row for the first day of each month that falls between the earliest and latest date of each product, so every product has its own "from" date and "until" date. The output I want to achieve looks like this:

Date, Product, Value1, Value2
13-3-2020, A, 10, 15
1-4-2020, A, NA, NA
13-4-2020, A, 11, 26
1-5-2020, A, NA, NA
13-5-2020, A, 14, 14
2-2-2018, B, 10, 15
18-2-2018, B, 11, 26
1-3-2018, B, NA, NA
1-4-2018, B, NA, NA
5-4-2018, B, 14, 14
1-5-2018, B, NA, NA
5-5-2018, B, 12, 12

Any help is much appreciated.

Javad Bahoosh

1 Answer

I think the following code will do the job for you. You can change the freq='MS' parameter (month start) if you want a different interval.

[javad@ThinkPad ~ ]$ cat pandas_fill_first_month.py
from io import StringIO
import pandas as pd
data = StringIO("""\
Date, Product, Value1, Value2
13-3-2020, A, 10, 15
13-4-2020, A, 11, 26
13-5-2020, A, 14, 14
2-2-2018, B, 10, 15
18-2-2018, B, 11, 26
5-4-2018, B, 14, 14
5-5-2018, B, 12, 12""")
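# The dates are day-first (e.g. 13-3-2020), so parse them with dayfirst=True.
# A multi-character separator needs the Python parsing engine.
df = pd.read_table(data, sep=', ', engine='python', parse_dates=[0], dayfirst=True)

# First day of every month between the overall earliest and latest date.
month_df = pd.Series(pd.date_range(start=df['Date'].min(), end=df['Date'].max(), freq='MS'), name="Date").to_frame()
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
df = pd.concat([df, month_df], ignore_index=True)
df = df.sort_values(by='Date').reset_index(drop=True)
print(df)

[javad@ThinkPad ~ ]$ python3 pandas_fill_first_month.py
         Date Product  Value1  Value2
0  2018-02-02       B    10.0    15.0
1  2018-02-18       B    11.0    26.0
2  2018-03-01     NaN     NaN     NaN
3  2018-04-01     NaN     NaN     NaN
4  2018-04-05       B    14.0    14.0
5  2018-05-01     NaN     NaN     NaN
6  2018-05-05       B    12.0    12.0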
7  2018-06-01     NaN     NaN     NaN
8  2018-07-01     NaN     NaN     NaN
9  2018-08-01     NaN     NaN     NaN
10 2018-09-01     NaN     NaN     NaN
11 2018-10-01     NaN     NaN     NaN
12 2018-11-01     NaN     NaN     NaN
13 2018-12-01     NaN     NaN     NaN
14 2019-01-01     NaN     NaN     NaN
15 2019-02-01     NaN     NaN     NaN
16 2019-03-01     NaN     NaN     NaN
17 2019-04-01     NaN     NaN     NaN
18 2019-05-01     NaN     NaN     NaN
19 2019-06-01     NaN     NaN     NaN
20 2019-07-01     NaN     NaN     NaN
21 2019-08-01     NaN     NaN     NaN
22 2019-09-01     NaN     NaN     NaN
23 2019-10-01     NaN     NaN     NaN
24 2019-11-01     NaN     NaN     NaN
25 2019-12-01     NaN     NaN     NaN
26 2020-01-01     NaN     NaN     NaN
27 2020-02-01     NaN     NaN     NaN
28 2020-03-01     NaN     NaN     NaN
29 2020-03-13       A    10.0    15.0
30 2020-04-01     NaN     NaN     NaN
31 2020-04-13       A    11.0    26.0
32 2020-05-01     NaN     NaN     NaN
33 2020-05-13       A    14.0    14.0
Javad Bahoosh
  • Hi Javad, thank you for your answer, but it's not what I'm looking for. I need the dates to be filled per product, so the time span differs for each product and only covers the interval between that product's minimum and maximum date. For product A the limits would be 3/2020 to 5/2020, but for product B they would be 2/2018 to 5/2018. – descendents84 Jun 20 '20 at 11:49
  • @descendents84 I think you can reach your desired data structure by filtering and dropping some columns from the final result of this code. – Javad Bahoosh Jun 20 '20 at 11:51
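
For the per-product requirement described in the comments, one possible approach is to build the month-start range separately for each product and then concatenate everything. This is only a minimal sketch, not the answer author's method: the names `fillers` and `result` are made up for illustration, and it assumes the dates are day-first (`%d-%m-%Y`) and that the filler rows should carry the product label with missing values elsewhere.

```python
from io import StringIO

import pandas as pd

data = StringIO("""\
Date, Product, Value1, Value2
13-3-2020, A, 10, 15
13-4-2020, A, 11, 26
13-5-2020, A, 14, 14
2-2-2018, B, 10, 15
18-2-2018, B, 11, 26
5-4-2018, B, 14, 14
5-5-2018, B, 12, 12""")

# A regex separator (comma plus optional spaces) needs the Python engine.
df = pd.read_csv(data, sep=r",\s*", engine="python")
# Assumption: dates are day-first, as in 13-3-2020.
df["Date"] = pd.to_datetime(df["Date"], format="%d-%m-%Y")

# One filler frame per product: the first day of each month between that
# product's own earliest and latest date (hypothetical variable names).
fillers = [
    pd.DataFrame({
        "Date": pd.date_range(group["Date"].min(), group["Date"].max(), freq="MS"),
        "Product": product,
    })
    for product, group in df.groupby("Product", sort=False)
]

# Value1/Value2 are absent from the filler frames, so they become NaN there.
result = pd.concat([df, *fillers], ignore_index=True)
result = result.sort_values(["Product", "Date"]).reset_index(drop=True)
print(result)
```

Because each range is bounded by the product's own minimum and maximum date, no rows appear for the months between the two products. Note that the value columns become floats once NaN is introduced.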