I need to fill a missing date down within each group. Below is the code to create the data frame. I want to carry each date in the 'fill' column down only until the next date appears in that column, and never past the point where the group 'name' changes.
import numpy as np
import pandas as pd

# np.nan marks the rows where no fill date has been recorded yet
data = {'tdate': [20080815, 20080915, 20081226, 20090110, 20090131, 20080807, 20080831,
                  20080918, 20081023, 20081114, 20081207, 20090117, 20090203, 20090219,
                  20090305, 20090318, 20090501],
        'name': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B'],
        'fill': [np.nan, np.nan, 20080915, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 20081023,
                 np.nan, np.nan, np.nan, np.nan, 20090219, np.nan, np.nan]}
df = pd.DataFrame(data, columns=['tdate', 'name', 'fill'])
df
Current data frame
       tdate name      fill
0   20080815    A       NaN
1   20080915    A       NaN
2   20081226    A  20080915
3   20090110    A       NaN
4   20090131    A       NaN
5   20080807    B       NaN
6   20080831    B       NaN
7   20080918    B       NaN
8   20081023    B       NaN
9   20081114    B  20081023
10  20081207    B       NaN
11  20090117    B       NaN
12  20090203    B       NaN
13  20090219    B       NaN
14  20090305    B  20090219
15  20090318    B       NaN
16  20090501    B       NaN
Desired output
       tdate name      fill
0   20080815    A       NaN
1   20080915    A       NaN
2   20081226    A  20080915
3   20090110    A  20080915
4   20090131    A  20080915
5   20080807    B       NaN
6   20080831    B       NaN
7   20080918    B       NaN
8   20081023    B       NaN
9   20081114    B  20081023
10  20081207    B  20081023
11  20090117    B  20081023
12  20090203    B  20081023
13  20090219    B  20081023
14  20090305    B  20090219
15  20090318    B  20090219
16  20090501    B  20090219
Here is the code I tried:
df.groupby(df["name"])["fill"].fill()
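For reference, here is a minimal sketch of one way this might be done, assuming the goal is an ordinary per-group forward fill. It uses the pandas group-by method `ffill` (there is no `fill` method on a group-by object, so the line above raises an AttributeError). The variable name `result` is only illustrative and not part of the original code.

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame({
    'tdate': [20080815, 20080915, 20081226, 20090110, 20090131, 20080807,
              20080831, 20080918, 20081023, 20081114, 20081207, 20090117,
              20090203, 20090219, 20090305, 20090318, 20090501],
    'name': ['A'] * 5 + ['B'] * 12,
    'fill': [nan, nan, 20080915, nan, nan, nan, nan, nan, nan, 20081023,
             nan, nan, nan, nan, 20090219, nan, nan],
})

# Forward-fill 'fill' separately within each 'name' group. A value is
# carried down until the next non-missing value in that group, and it
# never crosses into the following group.
result = df.assign(fill=df.groupby('name')['fill'].ffill())
print(result)
```

In this sketch each date stays on its original row and is repeated on the rows beneath it. Rows at the start of a group that come before the first date remain NaN, because nothing from the previous group is carried over.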