I thought a very explanatory approach.
First, we need a function that recognizes the date type. I didn't understand if there is a specific format in your csv, so when in doubt we will use a function that recognizes any pattern.
Check out 'Check if string has date, any format':
from dateutil.parser import parse
def is_date(string, fuzzy=False):
try:
parse(string, fuzzy=fuzzy)
return True
except ValueError:
return False
At this point, we can iterate for each row in your dataframe and where there is no value in the right column, we search on all the next ones.
sub_df = df.iloc[:, df.columns.str.find("Opening Data").argmax()+1:] # retrieve only remaining columns
for index, row in df.iterrows():
if not row['Opening Data']:
for col in sub_df.columns:
if is_date(row[col]):
df.iloc[index]['Opening Data'] = row[col]
df.iloc[index][col] = ''
Starting from a dataset of this form:
|
Opening Data |
col_0 |
col_1 |
0 |
01-01-2000 00:00:00 |
|
|
1 |
|
02-01-2000 00:00:00 |
|
2 |
|
|
03-01-2000 00:00:00 |
the output will be:
|
Opening Data |
col_0 |
col_1 |
0 |
01-01-2000 00:00:00 |
|
|
1 |
02-01-2000 00:00:00 |
|
|
2 |
03-01-2000 00:00:00 |
|
|