I received data in a format similar to this:

Time                    Humidity Condition
2014-09-01 00:00:00     84       Cloudy
2014-09-01 01:00:00     94       Rainy     

I tried to use df.resample('5T'), but it seems the data cannot be replicated within the same hour, and df.resample('5T') needs a function such as mean(), which I do not need.

I tried it with df.resample('5T').mean().

But the problem is that I don't want to use mean(), because it does not keep "Humidity" and "Condition" as they were in the original. I just want the data to be:

Time                    Humidity Condition
2014-09-01 00:00:00     84       Cloudy
2014-09-01 00:05:00     84       Cloudy
2014-09-01 00:10:00     84       Cloudy
.
.
.
2014-09-01 00:55:00     84       Cloudy 
2014-09-01 01:00:00     94       Rainy
2014-09-01 01:05:00     94       Rainy
2014-09-01 01:10:00     94       Rainy 
.
.
.  

Is there a way to do this? Many thanks!

KiuSandy
  • `it seems the data cannot be replicated for the same hour` Can you clarify this? What does it mean that the data can't be replicated? – Nick ODell Dec 02 '22 at 03:33
  • Yes, I already tried df.resample('5T').mean(), but the 00:05:00 row does not get the same values as the 00:00:00 row; its values turn out to be NaN. – KiuSandy Dec 02 '22 at 03:37
  • You need to check your data: 2014-09-01 00:01:00 94 Rainy – Panda Kim Dec 02 '22 at 03:43
  • @KiuSandy When I try `df.resample('5T').mean()` on a sample of the first dataset extended to be at least 5 minutes long, that works without introducing any NaNs. Can you post a [reproducible example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) of the problem? – Nick ODell Dec 02 '22 at 03:45
  • @NickODell Thank you very much for your assistance; it is finally solved. – KiuSandy Dec 02 '22 at 04:20

1 Answer


Example

import pandas as pd

data = {'Time': {0: '2014-09-01 00:00:00', 1: '2014-09-01 01:00:00'},
        'Humidity': {0: 84, 1: 94},
        'Condition': {0: 'Cloudy', 1: 'Rainy'}}
df = pd.DataFrame(data)

df

    Time                Humidity    Condition
0   2014-09-01 00:00:00 84          Cloudy
1   2014-09-01 01:00:00 94          Rainy

Code

I wrote the code with 20T instead of 5T, because a 5T interval is too short to show compactly.

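# Index by timestamp, expand to a 20-minute grid, then forward-fill the new rows.
# Newer pandas versions prefer the alias '20min' over '20T'.
(df.set_axis(pd.to_datetime(df['Time']))
 .reindex(pd.date_range(df['Time'][0], freq='20T', periods=6))
 .assign(Time=lambda x: x.index)
 .reset_index(drop=True).ffill())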

result:

    Time                Humidity    Condition
0   2014-09-01 00:00:00 84.0        Cloudy
1   2014-09-01 00:20:00 84.0        Cloudy
2   2014-09-01 00:40:00 84.0        Cloudy
3   2014-09-01 01:00:00 94.0        Rainy
4   2014-09-01 01:20:00 94.0        Rainy
5   2014-09-01 01:40:00 94.0        Rainy
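
The same idea can be applied to the 5-minute interval from the question. The following is only a minimal sketch, assuming the data is hourly and that the last row should be repeated until the end of its hour (the 55-minute extension is an assumption, as are the names hourly, grid and upsampled). It passes method="ffill" to reindex, so no NaN rows are created along the way.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Time": ["2014-09-01 00:00:00", "2014-09-01 01:00:00"],
    "Humidity": [84, 94],
    "Condition": ["Cloudy", "Rainy"],
})
df["Time"] = pd.to_datetime(df["Time"])
hourly = df.set_index("Time")

# 5-minute grid from the first timestamp to the last 5-minute slot
# of the final hour (assumption: each row is valid for one full hour)
grid = pd.date_range(
    start=hourly.index.min(),
    end=hourly.index.max() + pd.Timedelta(minutes=55),
    freq="5min",
)

# Each new timestamp takes the values of the most recent original row
upsampled = (
    hourly.reindex(grid, method="ffill")
    .rename_axis("Time")
    .reset_index()
)
print(upsampled.head(14))
```

If the rows after the last timestamp are not needed, hourly.resample("5min").ffill() should give the grid between the first and last timestamps without any aggregation such as mean().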
Panda Kim