I am trying to read the information in a CSV file using pandas, the CSV has two columns, a timestamp (example: 31/12/2022 21:30) and a float value. This file is generate by an intrument that measures energy compsuntion per quater of an hour. This device sometimes can malfuntion and it would not register some quarters, and i will like to fill that gaps with the time and a 0 value (or the mean of the next value by all the skip quarter plus 1). For that I try to used the solution given here:
https://stackoverflow.com/questions/69205577/fill-gaps-in-time-series-pandas-dataframe
But I could not convert to an index because of duplicated values the day 30-10-2022 due to daylight savings. enter image description here
I tried to convert to the timezone, but it doesn´t seems to work:
telemedida_df=pd.read_csv("filename.csv",delimiter=";",decimal=",")
telemedida_df["Fecha/hora"]=pd.to_datetime(telemedida_df["Fecha/hora"],format='%d/%m/%Y %H:%M',utc=True)
telemedida_df["Fecha/hora"] = telemedida_df["Fecha/hora"].dt.tz_convert('Europe/Madrid')
df_new = (telemedida_df.assign(date=telemedida_df["Fecha/hora"].dt.date) #create new col 'date' from the timestamp
.set_index("Fecha/hora") #Error here
.groupby('date')
.apply(lambda x: x.resample('15Min')
.ffill())
.reset_index('date', drop=True)
.drop('date',1)
.reset_index()
)
The following error shows "cannot reindex a non-unique index with a method or limit". Before the error the Dataframe looks like: Dataframe before error
As you can see I am not getting +01 and +02 in the "Fecha/hora" column.
How could I take transfort datetime to index here taking into account the daylight savings?
Example of CSV DATA (It uses Spanish comma notation insted of US dot for decimal) :
Fecha/hora;Consumo
30/10/2022 0:15;6,9
30/10/2022 0:30;6,6
30/10/2022 0:45;6,7
30/10/2022 1:00;6,7
30/10/2022 1:15;6,6
30/10/2022 1:30;6,9
30/10/2022 1:45;6,8
30/10/2022 2:00;6,7
30/10/2022 2:15;6,6
30/10/2022 2:30;6,7
30/10/2022 2:45;6,7
30/10/2022 2:00;6,7
30/10/2022 2:15;6,7
30/10/2022 2:30;6,8
30/10/2022 2:45;6,7
30/10/2022 3:00;6,8
30/10/2022 3:15;6,8
30/10/2022 3:30;6,6
30/10/2022 3:45;7
30/10/2022 4:00;6,6
30/10/2022 4:15;6,8
30/10/2022 4:30;6,6
30/10/2022 4:45;6,7
30/10/2022 5:00;6,6
I tried different ways of converting to the timezone without any result