I have many signals that are logged on change. I retrieve the data by reading from avro into a list of lists and then create a dataframe.
I use ‘groupby’ to get the data for each signal and would like to plot the different signals in the same plot. The signals have different numbers of entries, and this is causing me great trouble. I have created a simplified example to work with while trying to solve this.
print(df1)
ts value
0 2019-10-18T08:13:26.790000 6
1 2019-10-18T08:13:26.889000 7
2 2019-10-18T08:13:26.901000 10
3 2019-10-18T08:13:27.098000 1
4 2019-10-18T08:13:27.188000 8
5 2019-10-18T08:23:26.527000 13
6 2019-10-18T08:23:26.725000 12
print(df2)
ts value
0 2019-10-18T08:23:26.375000 12.0
1 2019-10-18T08:23:26.527000 7.0
2 2019-10-18T08:23:26.575000 8.0
3 2019-10-18T08:23:26.725000 6.0
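For anyone who wants to reproduce this, the two frames above can be rebuilt roughly as follows. This is a minimal sketch: it assumes 'ts' is held as plain ISO-8601 strings, which is a guess about what the avro read produces rather than something confirmed here.

```python
import pandas as pd

# Assumption: 'ts' is stored as plain strings, not as parsed datetimes.
df1 = pd.DataFrame({
    "ts": [
        "2019-10-18T08:13:26.790000",
        "2019-10-18T08:13:26.889000",
        "2019-10-18T08:13:26.901000",
        "2019-10-18T08:13:27.098000",
        "2019-10-18T08:13:27.188000",
        "2019-10-18T08:23:26.527000",
        "2019-10-18T08:23:26.725000",
    ],
    "value": [6, 7, 10, 1, 8, 13, 12],
})

df2 = pd.DataFrame({
    "ts": [
        "2019-10-18T08:23:26.375000",
        "2019-10-18T08:23:26.527000",
        "2019-10-18T08:23:26.575000",
        "2019-10-18T08:23:26.725000",
    ],
    "value": [12.0, 7.0, 8.0, 6.0],
})

# Timestamps that occur in both frames; these should line up on the x-axis.
shared_ts = sorted(set(df1["ts"]) & set(df2["ts"]))
print(shared_ts)
```

The two shared timestamps are the points where the signals should coincide on the x-axis in a correct plot.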
I plot them like this:
import matplotlib.pyplot as plt

# Draw both frames on the same axes
ax = plt.gca()
df1.plot(ax=ax, x='ts', y='value', c='xkcd:burgundy', legend=True)
df2.plot(ax=ax, x='ts', y='value', c='xkcd:baby blue', legend=True)
plt.gcf().autofmt_xdate()
Result:
I don't know if my issue is with the datetime object (the original data is from Azure), or if the issue is that the number of entries is not the same.
As seen, the equal timestamps are not plotted correctly. It seems the plot is taking the data entry by entry, by position, rather than by time. I am also wondering why the datetime on the x-axis is not shown for the last entries of df1.
I then try something different:
ax=plt.gca()
ax2=ax.twiny()
df1.plot(ax=ax, x='ts', y='value', c='xkcd:burgundy')
df2.plot(ax=ax2, x='ts', y='value', c='xkcd:baby blue',
secondary_y=True)
df2.plot(ax=ax, x='ts', y='value', c='xkcd:mustard')
plt.gcf().autofmt_xdate()
plt.show()
Result:
Here I have tried to plot df2 both on a secondary axis and on the same axis, to see what happens. Maybe I don’t understand what I am actually asking for here, but the result is not what I want.
I tried twinx() as I thought this was more logical, but had no luck with that either. twiny() brings me closer, sort of (and I don't know why). But I want all my signals to share the same x-axis and be plotted correctly in relation to each other, no matter how many entries they have. What to do?
In my real task I need to plot several signals on the same y-axis, and some signals on a secondary y-axis, as the signal values are on different levels, so to speak. So a solution that works with both is very welcome.
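To make that requirement concrete, here is a minimal sketch of the layout I am after, written directly against matplotlib. It assumes the 'ts' strings parse cleanly with pd.to_datetime, uses shortened copies of the example frames, and has not been checked against the real avro data. Which signal goes on which y-axis is only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Shortened copies of the example frames (assumption: 'ts' starts out as strings).
df1 = pd.DataFrame({
    "ts": ["2019-10-18T08:13:26.790000", "2019-10-18T08:13:27.188000",
           "2019-10-18T08:23:26.527000", "2019-10-18T08:23:26.725000"],
    "value": [6, 8, 13, 12],
})
df2 = pd.DataFrame({
    "ts": ["2019-10-18T08:23:26.375000", "2019-10-18T08:23:26.527000",
           "2019-10-18T08:23:26.575000", "2019-10-18T08:23:26.725000"],
    "value": [12.0, 7.0, 8.0, 6.0],
})

# Parse to real datetimes so points are placed by time, not by row position.
df1["ts"] = pd.to_datetime(df1["ts"])
df2["ts"] = pd.to_datetime(df2["ts"])

fig, ax = plt.subplots()
ax_right = ax.twinx()  # second y-axis that shares the x-axis with ax

ax.plot(df1["ts"], df1["value"], color="xkcd:burgundy", label="df1 (left y-axis)")
ax_right.plot(df2["ts"], df2["value"], color="xkcd:baby blue", label="df2 (right y-axis)")

# Collect the lines from both axes into a single legend.
handles = list(ax.get_lines()) + list(ax_right.get_lines())
ax.legend(handles, [h.get_label() for h in handles])
fig.autofmt_xdate()
```

Signals that belong on the same y-axis would all be drawn with ax.plot, and only the ones on a different level with ax_right.plot. For interactive use, drop the matplotlib.use line and call plt.show() at the end.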
I need to see my signals in the same plot to see the interaction, and then I need a common x-axis to be correct. What can I do?
Is there some overall smarter way to do this?
EDIT 1 - after comments from Parfait
I read data from avro files, and my original df looks like this:
After just separating the signals with groupby, I ran into trouble when I started plotting. That is my original topic. I therefore created some simple dataframes with some of the ‘ts’ data and correspondingly simple values, so that I can check by eye whether the signals are represented correctly when plotted in the same plot.
After comments from Parfait, I loop through and try to change ‘ts’ to datetime:
It does not return datetime, but Timestamp. I started reading about this, and it seems others are having the same problem: a datetime is changed to a Timestamp when stored in a df column.
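A small check of what the conversion yields, as a sketch that assumes the conversion is done with pd.to_datetime (the exact conversion code is not reproduced here). One thing worth knowing when reading the output: pandas.Timestamp is a subclass of datetime.datetime, so seeing Timestamp does not by itself mean the conversion failed.

```python
import datetime

import pandas as pd

# Two of the example timestamps, starting out as plain strings (assumption).
df = pd.DataFrame({
    "ts": ["2019-10-18T08:13:26.790000", "2019-10-18T08:23:26.527000"],
    "value": [6, 13],
})

df["ts"] = pd.to_datetime(df["ts"])

first = df["ts"].iloc[0]
print(type(first))                          # element type of the converted column
print(isinstance(first, datetime.datetime)) # Timestamp subclasses datetime.datetime
print(df["ts"].dtype)                       # column dtype after conversion
```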
I create some simple dfs to help me find a solution without working with too big datasets:
I then try to convert once more, but Timestamp is still the output. And as seen below, the plot is wrong.
This is driving me crazy. After reading a lot, I found this post, and it seems this is a known problem. I then posted a new question requesting a workaround, hoping that using 'ts' as the index will solve the problem and keep the datetime format.
But Parfait, you state this is working for you. Is my problem clear to you now? What is your solution?
Thanks for all help! Anything bringing me closer to solving this is very helpful.