I am observing some strange behavior when plotting a column from my Pandas dataframe.

My data looks like the following:

    value   timestamp
0   67  2023-01-04T05:17:49+00:00
1   67  2023-01-04T05:27:57+00:00
2   66  2023-01-04T05:28:01+00:00
3   81  2023-01-04T05:28:19+00:00
4   67  2023-01-04T05:33:02+00:00
... ... ...
102 73  2023-01-04T22:22:04+00:00
103 76  2023-01-04T22:26:59+00:00
104 77  2023-01-04T22:27:12+00:00
105 75  2023-01-04T22:27:13+00:00
106 73  2023-01-04T22:32:35+00:00

Now comes the fun part:

I convert the timestamp column to pandas datetimes with: df['timestamp_dt'] = pd.to_datetime(df['timestamp']). I store the result in a new column so that both versions can be compared.
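For reference, the conversion step can be reproduced on the first five rows of the sample above. This is only a minimal sketch: the small frame is rebuilt by hand from the table, and `gaps` is a name introduced here for illustration. Printing the difference between consecutive timestamps shows how unevenly the samples are spaced.

```python
import pandas as pd

# First five rows of the sample data shown in the question.
df = pd.DataFrame({
    "value": [67, 67, 66, 81, 67],
    "timestamp": [
        "2023-01-04T05:17:49+00:00",
        "2023-01-04T05:27:57+00:00",
        "2023-01-04T05:28:01+00:00",
        "2023-01-04T05:28:19+00:00",
        "2023-01-04T05:33:02+00:00",
    ],
})

# Parse the ISO 8601 strings into timezone-aware datetimes.
df["timestamp_dt"] = pd.to_datetime(df["timestamp"])

# Compare the dtypes of the string column and the datetime column.
print(df.dtypes)

# Time elapsed since the previous sample (the first entry is NaT).
gaps = df["timestamp_dt"].diff()
print(gaps)

# Check that the samples are in chronological order.
print(df["timestamp_dt"].is_monotonic_increasing)
```

On a datetime axis each point is placed according to these time differences, whereas the string column has no notion of elapsed time.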

I then plot both columns with seaborn:

# Plot where timestamp is a string
sns.lineplot(data=df, x="timestamp", y="value")
plt.show()

[Image: line plot with the string timestamps on the x-axis]

# Plot where timestamp is a datetime object
sns.lineplot(data=df, x="timestamp_dt", y="value")
plt.show()

[Image: line plot with the datetime timestamps on the x-axis]

As you can see, converting the timestamps into datetime objects makes the graph look strange.

Why is this, and how can I convert the timestamps and still get a normal-looking graph?

I have looked at the following question, whose solution suggests adding a format. However, I was not able to solve my problem that way.

Why does changing "Date" column to datetime ruin graph?

petezurich
AlexPython
  • Is your time column strictly monotonically increasing? Are there "gaps", i.e. a variable frequency? – FObersteiner Jan 09 '23 at 07:44
  • The value is independent of time, but yes, there might be gaps. The value is a sensor measurement that is taken roughly every other second. It could be every 10 seconds, but also 2 times per second... – AlexPython Jan 09 '23 at 07:50
  • related? [Is there a Matplotlib hack to plot time series data continuously with missing hourly data?](https://stackoverflow.com/q/66508613/10197418) – FObersteiner Jan 09 '23 at 07:51
  • The solution from that post (converting to string) gives me the result that I want. But then the column is not a datetime object but a string, which would be the same as "just using the column as it is". If I understand you correctly, you are saying that this behavior is due to inconsistent timestamps? This makes perfect sense, but why can it be plotted as a string? – AlexPython Jan 09 '23 at 07:56
  • As a string, you plot a categorical variable: one element after the other. If you convert to datetime, however, and the frequency isn't fixed, the plotted elements won't be equidistant, i.e. there will be "gaps" of varying size. If elements are connected by default, that might look weird, as in your case. – FObersteiner Jan 09 '23 at 08:02
  • Is it possible to plot the values and leave gaps where there is no data? Similar to the question in the post that you linked. The answer there was to use strings, which don't leave gaps (at least for me). – AlexPython Jan 09 '23 at 08:53
  • You could try using a scatter plot, where the points aren't connected. Alternatively, I think you might also resample to seconds frequency (1 Hz), creating NaNs where you have no data. That would remove the connecting lines between points that are apart by more than 1 second. – FObersteiner Jan 09 '23 at 09:16
  • See also [Python: Matplotlib avoid plotting gaps](https://stackoverflow.com/q/27266987/10197418) - this is for matplotlib, not sure if it works equally with seaborn – FObersteiner Jan 09 '23 at 10:17
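The idea discussed in the comments above (do not connect points that are far apart in time) can be sketched as follows. This is one possible approach under stated assumptions, not a confirmed answer to the question: `add_segments`, `plot_segments`, the "segment" column, and the 2 minute threshold are all hypothetical choices made for this illustration. Instead of resampling, the sketch labels each run of closely spaced samples and lets seaborn draw one line per run through the `units` parameter of `sns.lineplot`, which has to be combined with `estimator=None`.

```python
import pandas as pd


def add_segments(df, time_col="timestamp_dt", max_gap="2min"):
    """Label runs of samples that are at most `max_gap` apart.

    The function name, `max_gap` and the "segment" column are
    hypothetical names chosen for this sketch.
    """
    out = df.sort_values(time_col).reset_index(drop=True)
    # True for every sample that follows a gap longer than max_gap.
    # The first difference is NaT, which compares as False.
    new_segment = out[time_col].diff() > pd.Timedelta(max_gap)
    # The running count of long gaps gives each run its own integer id.
    out["segment"] = new_segment.cumsum()
    return out


def plot_segments(df, time_col="timestamp_dt", value_col="value"):
    """Draw one line per segment so that long gaps stay empty."""
    # Imported here so the data preparation above works without them.
    import matplotlib.pyplot as plt
    import seaborn as sns

    # units= draws a separate line for each segment; seaborn requires
    # estimator=None whenever units is used. The marker keeps
    # single-sample segments visible.
    ax = sns.lineplot(data=df, x=time_col, y=value_col,
                      units="segment", estimator=None, marker="o")
    plt.show()
    return ax


# First five rows of the sample data from the question.
sample = pd.DataFrame({
    "value": [67, 67, 66, 81, 67],
    "timestamp": [
        "2023-01-04T05:17:49+00:00",
        "2023-01-04T05:27:57+00:00",
        "2023-01-04T05:28:01+00:00",
        "2023-01-04T05:28:19+00:00",
        "2023-01-04T05:33:02+00:00",
    ],
})
sample["timestamp_dt"] = pd.to_datetime(sample["timestamp"])

segmented = add_segments(sample)
print(segmented[["timestamp_dt", "value", "segment"]])
```

Calling `plot_segments(segmented)` would then draw the plot. The threshold has to be tuned to the sensor: too small and almost every point becomes its own segment, too large and distant points are connected again.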

0 Answers