I have a text file that contains time series data, but there are gaps in both the timestamps and the values. (I include only the first few rows as an example; the full series runs from 1996 to 2010.)

o_data is a DataFrame:

   Time           Value
01.01.1996 00:00       nan
01.01.1996 00:10       10.4
01.01.1996 00:20       10.4
01.01.1996 00:50       10.4

I create a time index with a frequency of 10 minutes:

idx = pd.date_range(start=min(o_data.Time), end=max(o_data.Time), freq='10Min')

out[ ]: idx --> (DatetimeIndex)

           0
01.01.1996 00:00
01.01.1996 00:10
01.01.1996 00:20
01.01.1996 00:30
01.01.1996 00:40
01.01.1996 00:50

I want to align the values from the o_data DataFrame to the time index created above and fill the gaps with NaN:

new_o_data = (pd.DataFrame( o_data, index=idx ).fillna('NaN'))

The desired result is:

        Time        Value
01.01.1996 00:00     NaN
01.01.1996 00:10     10.4
01.01.1996 00:20     10.4
01.01.1996 00:30      NaN
01.01.1996 00:40      NaN
01.01.1996 00:50      10.4

But what I get after running the code is a DataFrame whose Time and Value columns contain only NaN:

out[ ]: new_o_data --> (DataFrame)

        index            Time       Value
1996-01-01 00:00:00       NaN        NaN
1996-01-01 00:10:00       NaN        NaN
1996-01-01 00:20:00       NaN        NaN
1996-01-01 00:30:00       NaN        NaN
1996-01-01 00:40:00       NaN        NaN
1996-01-01 00:50:00       NaN        NaN

I would appreciate it if you could help me.

Jonathan Leon
  • Can you re-create the problem using just a sample dataset that you can embed in the code? That will make it a lot easier for others to help you: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – robertwest Dec 14 '20 at 23:29

1 Answer

I'm not exactly sure how your data is set up, but I had to convert the Time column to datetime, make it the index, and use reindex to get your result:

import io

import numpy as np
import pandas as pd

d = """Time,Value
01.01.1996 00:00,nan
01.01.1996 00:10,10.4
01.01.1996 00:20,10.4
01.01.1996 00:50,10.4"""

o_data = pd.read_csv(io.StringIO(d), sep=',')
# change data type; the explicit format assumes day-first timestamps (DD.MM.YYYY HH:MM)
o_data['Time'] = pd.to_datetime(o_data['Time'], format='%d.%m.%Y %H:%M')
o_data.set_index('Time', inplace=True)  # make Time the index
# new range based on the index (now that Time is the index)
idx = pd.date_range(start=o_data.index.min(), end=o_data.index.max(), freq='10min')
new_o_data = o_data.reindex(idx, fill_value=np.nan)  # reindex; missing slots become NaN

                     Value
1996-01-01 00:00:00    nan
1996-01-01 00:10:00 10.400
1996-01-01 00:20:00 10.400
1996-01-01 00:30:00    nan
1996-01-01 00:40:00    nan
1996-01-01 00:50:00 10.400
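
For context, a likely reason the original attempt returned only NaN: passing an existing DataFrame together with index=idx to pd.DataFrame selects rows by label, and o_data still had its default integer index (0, 1, 2, ...), so none of the timestamps in idx matched a row. A minimal sketch, assuming Time was an ordinary column as in the question and that the timestamps are day-first:

```python
import pandas as pd

# Small frame shaped like the question's o_data: Time is a regular column,
# so the index is the default RangeIndex 0..3.
o_data = pd.DataFrame({
    "Time": pd.to_datetime(
        ["01.01.1996 00:00", "01.01.1996 00:10",
         "01.01.1996 00:20", "01.01.1996 00:50"],
        format="%d.%m.%Y %H:%M",  # assumption: day-first timestamps
    ),
    "Value": [float("nan"), 10.4, 10.4, 10.4],
})
idx = pd.date_range(start=o_data["Time"].min(), end=o_data["Time"].max(), freq="10min")

# The labels in idx are timestamps and never match the labels 0..3,
# so every cell of the result is missing.
broken = pd.DataFrame(o_data, index=idx)
all_missing = bool(broken.isna().all().all())

# Aligning on the timestamps instead keeps the original values.
fixed = o_data.set_index("Time").reindex(idx)
kept = int(fixed["Value"].notna().sum())

print(all_missing, kept)
```

Here all_missing is True for the constructor call, while the reindexed frame keeps the three original readings.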

A second method uses join instead of reindex:

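import io

import pandas as pd

d = """Time,Value
01.01.1996 00:00,nan
01.01.1996 00:10,10.4
01.01.1996 00:20,10.4
01.01.1996 00:50,10.4"""

o_data = pd.read_csv(io.StringIO(d), sep=',')
# the explicit format assumes day-first timestamps (DD.MM.YYYY HH:MM)
o_data['Time'] = pd.to_datetime(o_data['Time'], format='%d.%m.%Y %H:%M')
o_data.set_index('Time', inplace=True)
# empty frame that carries only the complete 10-minute index
dftemp = pd.DataFrame(index=pd.date_range(start=o_data.index.min(), end=o_data.index.max(), freq='10min'))
new_o_data = dftemp.join(o_data)  # left join: timestamps without data get NaN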

Output

In [11]: new_o_data
Out[11]:
                     Value
1996-01-01 00:00:00    NaN
1996-01-01 00:10:00   10.4
1996-01-01 00:20:00   10.4
1996-01-01 00:30:00    NaN
1996-01-01 00:40:00    NaN
1996-01-01 00:50:00   10.4
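
As a possible third variant (not one of the two methods above), DataFrame.asfreq should build the regular 10-minute grid and insert the NaN rows in one call, provided Time is a sorted DatetimeIndex without duplicates. A minimal sketch on the same sample data, again assuming day-first timestamps:

```python
import io

import pandas as pd

d = """Time,Value
01.01.1996 00:00,nan
01.01.1996 00:10,10.4
01.01.1996 00:20,10.4
01.01.1996 00:50,10.4"""

o_data = pd.read_csv(io.StringIO(d))
# assumption: timestamps are day-first (DD.MM.YYYY HH:MM)
o_data["Time"] = pd.to_datetime(o_data["Time"], format="%d.%m.%Y %H:%M")
o_data = o_data.set_index("Time").sort_index()

# asfreq conforms the index to a fixed frequency between the first and
# last timestamp; slots with no original row are filled with NaN.
new_o_data = o_data.asfreq("10min")
print(new_o_data)
```

This avoids building idx by hand, but it relies on the same preconditions as reindex, so duplicate timestamps still have to be resolved first.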
Jonathan Leon
  • thanks Jonathan, but there is still a problem. – Alireza Dec 15 '20 at 09:48
  • Thanks, Jonathan, but there is still a problem. After running the code, the original values after a gap have changed. For example, there is a gap in row 202 and the value in row 203 is 10.2; in the resulting data frame the gap is in row 202, but the value in row 203 changes to 12.8. I don't know why. I am really in need of urgent help, and this is my mail address: Alireza.PourzakerArabani@b-tu.de – Alireza Dec 15 '20 at 09:58
  • If you can post that data using df.to_dict(), I can review it when I have a chance. Update your original question with the updated data. – Jonathan Leon Dec 15 '20 at 16:29
  • I am new here; how can I do this? Am I allowed to mail you the txt and Python files, so that you can see how it doesn't work? Thanks a lot again for your response. – Alireza Dec 16 '20 at 22:31
  • Sorry. Not giving out my email. Try o_data[195:210].to_dict(), then edit your question and paste in what was printed to your screen. If your index is in numerical order, it should pick up the rows you are talking about. Otherwise change the slice. Don't get discouraged. It's a process to learn this stuff. – Jonathan Leon Dec 16 '20 at 23:54
  • Try using join. I added that to my answer. If you're still getting an error, you may have a data issue that you have to correct first, before creating a new index. Good luck! – Jonathan Leon Dec 17 '20 at 04:29
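
The kind of data issue mentioned in the last comment could be checked before building the new index. The sketch below is only an illustration on made-up rows: the helper name check_timestamps and the sample values are hypothetical, not taken from the question. It counts duplicate timestamps, checks the ordering, and counts timestamps that do not fall on the 10-minute grid. It also parses a day-first string such as 02.01.1996 with and without an explicit format; a day/month mix-up is one possible, unconfirmed cause of values appearing to shift.

```python
import pandas as pd


def check_timestamps(times: pd.Series, freq: str = "10min") -> dict:
    """Summarise common problems in a datetime column (hypothetical helper)."""
    return {
        # rows whose timestamp already occurred earlier in the column
        "duplicates": int(times.duplicated().sum()),
        # True when timestamps never go backwards
        "sorted": bool(times.is_monotonic_increasing),
        # timestamps that are not a multiple of the target frequency
        "off_grid": int((times != times.dt.floor(freq)).sum()),
    }


# Made-up sample: one repeated timestamp and one reading at 00:25.
raw = pd.Series([
    "01.01.1996 00:00",
    "01.01.1996 00:10",
    "01.01.1996 00:10",
    "01.01.1996 00:25",
])
times = pd.to_datetime(raw, format="%d.%m.%Y %H:%M")
report = check_timestamps(times)
print(report)

# With an explicit day-first format this is 2 January 1996.
explicit = pd.to_datetime("02.01.1996 00:00", format="%d.%m.%Y %H:%M")
# Without a format pandas has to guess the day/month order; compare the two.
implicit = pd.to_datetime("02.01.1996 00:00")
print(explicit, implicit)
```

If the report shows duplicates or off-grid timestamps, those rows would need to be dropped, aggregated, or rounded before reindex or join can line the values up as expected.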