I have a dataframe in which one column name is repeated. My code forward-fills blank values, but it only does so for the first column found with that name.
Example dataframe:
import pandas as pd
data_dict = {'name1': [1.0, '', 2.0, '', 5.0], 'name2': [10.0, '', 14.0, 18.0, ''], 'name3': ['some string2', 'some string3', 'some string4', 'some string5', 'some string6'], 'name4': ['description2', 'description3', 'description4', 'description5', 'description6'], 'name2.1': [36.0, '', '', 44.0, ''], 'name6': ['more text2', 'more text3', 'more text4', 'more text5', 'more text6']}
df = pd.DataFrame.from_dict(data_dict)
df
Dataframe:
name1  name2  name3         name4         name2  name6
1      10     some string2  description2  36     more text2
              some string3  description3         more text3
2      14     some string4  description4         more text4
       18     some string5  description5  44     more text5
5             some string6  description6         more text6
Here is my code:
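# Column names whose blanks should be forward-filled
# (lowercase, to match the dataframe's actual column labels)
backfill_header_list = [
    'name1',
    'name2'
]
for i in backfill_header_list:
    # .ffill() replaces the deprecated fillna(method='ffill')
    df.loc[:, i] = df.loc[:, i].ffill()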
For this example, name2 is the repeated column name.
Desired output:
name1  name2  name3         name4         name2  name6
1      10     some string2  description2  36     more text2
1      10     some string3  description3  36     more text3
2      14     some string4  description4  36     more text4
2      18     some string5  description5  44     more text5
5      18     some string6  description6  44     more text6
Is there an efficient way to have pandas iterate through all columns that match that name?
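For reference, here is a minimal sketch of the kind of thing I am after, not a definitive solution. It assumes the blanks are empty strings (as in the example data) and that the repeated column really carries the label name2; the rename from name2.1 and the names fill_headers and positions are only there for illustration. The idea is to select the matching columns by position rather than by label, so every same-named column is handled separately.

```python
import numpy as np
import pandas as pd

data_dict = {
    'name1': [1.0, '', 2.0, '', 5.0],
    'name2': [10.0, '', 14.0, 18.0, ''],
    'name3': ['some string2', 'some string3', 'some string4', 'some string5', 'some string6'],
    'name4': ['description2', 'description3', 'description4', 'description5', 'description6'],
    'name2.1': [36.0, '', '', 44.0, ''],
    'name6': ['more text2', 'more text3', 'more text4', 'more text5', 'more text6'],
}
df = pd.DataFrame.from_dict(data_dict)

# Assumption: the duplicate label is produced by a rename like this one,
# since a dict cannot hold the same key twice.
df = df.rename(columns={'name2.1': 'name2'})

# Hypothetical list of labels to forward-fill.
fill_headers = ['name1', 'name2']

# Integer positions of every column whose label is listed, duplicates included.
positions = np.flatnonzero(df.columns.isin(fill_headers))

# Select by position, turn empty strings into NaN, then forward-fill each column.
subset = df.iloc[:, positions]
filled = subset.where(subset != '').ffill()

# Write back by position so each same-named column receives its own values.
df.iloc[:, positions] = filled.to_numpy()
```

With the example data this should give the values in the desired output above, because the positional write-back never has to resolve the ambiguous name2 label.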