I have a problem with setting the x-axis data in matplotlib to datetime. My data looks like this:

     timestamp_group_start           date_time  avg_11006  avg_53119  avg_11995  avg_37595  avg_42740  avg_56826
0               1670104020 2022-12-03 22:47:00  18.246300  17.873331  19.998333  20.123986  18.661029  20.186845
1               1670104080 2022-12-03 22:48:00  18.246300  17.873331  19.994792  20.123986  18.686012  20.211735
2               1670104140 2022-12-03 22:49:00  18.246300  17.873331  20.024194  20.123986  18.686012  20.248527
3               1670104200 2022-12-03 22:50:00  18.246300  17.873331  20.013761  20.123986  18.686012  20.248527
4               1670104260 2022-12-03 22:51:00  18.246300  17.873331  20.062500  20.123986  18.686012  20.248527
..                     ...                 ...        ...        ...        ...        ...        ...        ...
256             1670119380 2022-12-04 03:03:00  19.311873  19.024235  21.000000  21.061362  20.000000  21.123863
257             1670119440 2022-12-04 03:04:00  19.311873  19.017045  21.000000  21.061362  20.062500  21.123863
258             1670119500 2022-12-04 03:05:00  19.311873  19.051306  21.000000  21.061362  20.062500  21.123863
259             1670119560 2022-12-04 03:06:00  19.311873  19.062042  21.003440  21.061362  20.062500  21.123863
260             1670119620 2022-12-04 03:07:00  19.375000  20.250000  21.125000  21.687500  20.875000  21.687500

I would like to plot all columns starting with avg, with datetime values on the x axis. When I set the x data to the timestamp (int), everything works fine:

axs.scatter(
    temps["timestamp_group_start"],
    temps[col_name],
)

But when I try to change it to:

axs.scatter(
    temps["date_time"],
    temps[col_name],
)

I get an error like this:

  File "D:\Documents\Projects\project_name\.venv\Lib\site-packages\matplotlib\dates.py", line 359, in _from_ordinalf
    np.timedelta64(int(np.round(x * MUSECONDS_PER_DAY)), 'us'))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OverflowError: int too big to convert

The type of each record in data["date_time"] is <class 'pandas._libs.tslibs.timestamps.Timestamp'>. I've read "OverflowError: int too big to convert when formatting date on pandas series plot", but it doesn't seem to solve my problem.

EDIT: The problem was caused by the plt.xlim(timestamp_start, timestamp_end) statement. After removing this line, everything works like a charm. I still have to figure out why setting the limits makes it break, but at least the cause is clear.
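
A plausible explanation, sketched under assumptions: once datetime values are plotted, matplotlib represents the x axis as float days since its epoch, so integer epoch seconds passed to plt.xlim would be read as roughly 1.67 billion days, which is far outside the range a datetime can represent. The sketch below assumes timestamp_start and timestamp_end are epoch seconds like timestamp_group_start, and converts them with datetime.fromtimestamp (the conversion mentioned in the comments) before setting the limits. The small DataFrame is a stand-in built from the first rows shown above; the thread does not confirm that this matches the original code.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the question's data: epoch seconds at 60 s intervals.
temps = pd.DataFrame(
    {
        "timestamp_group_start": [1670104020 + 60 * i for i in range(5)],
        "avg_11995": [19.998333, 19.994792, 20.024194, 20.013761, 20.062500],
    }
)
# Same conversion the asker describes: local-time datetimes from epoch seconds.
temps["date_time"] = [datetime.fromtimestamp(t) for t in temps["timestamp_group_start"]]

fig, axs = plt.subplots()
axs.scatter(temps["date_time"], temps["avg_11995"])

# Assumption: the limits are epoch seconds, like timestamp_group_start.
timestamp_start, timestamp_end = 1670104020, 1670104260

# Convert the limits to datetimes so they are in the same units as the x data.
axs.set_xlim(
    datetime.fromtimestamp(timestamp_start),
    datetime.fromtimestamp(timestamp_end),
)

fig.canvas.draw()  # forces tick formatting, the step where the OverflowError appeared
```

With converted limits the axis range stays within a few minutes of the data instead of being interpreted as billions of days.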

Thank you for any tips!

zorka5
  • I think the timestamp_group_start column is in ns, not ms; please check the function to validate it. – Xiaomin Wu Jul 30 '23 at 13:17
  • @XiaominWu but I don't think that's the case. When I use temps["date_time"] I no longer use temps["timestamp_group_start"] anywhere. And I think the datetime is correct. I convert the data when creating the pandas DataFrame by using self.date_time = datetime.fromtimestamp(self.timestamp_group_start) – zorka5 Jul 30 '23 at 13:21
  • Also I think that my timestamp is simply in seconds - I operate on 60 second intervals and it corresponds to the timestamps you can see – zorka5 Jul 30 '23 at 13:22
  • I get https://stackoverflow.com/questions/66659928/overflowerror-int-too-big-to-convert-when-formatting-date-on-pandas-series-plot – Xiaomin Wu Jul 30 '23 at 13:55
  • @XiaominWu also tried it, but no success – zorka5 Jul 30 '23 at 14:37
  • Can you show your complete code for more details? – Xiaomin Wu Jul 30 '23 at 15:26
  • **Can't recreate the problem on the provided data**. Please, provide [a minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) – Vitalizzare Jul 30 '23 at 20:25
  • I've added an edit to the post with the cause of the problem: the plt.xlim(timestamp_start, timestamp_end) line – zorka5 Jul 31 '23 at 15:30

1 Answer

First, convert the "date_time" column to a datetime dtype using pandas, and then set that column as the index.

When plotting, you then only need to pass the column that holds the y-axis values; the datetime index is used for the x axis.
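
The approach above could look roughly like the following. This is a minimal sketch, not the asker's actual code: the DataFrame holds only the first three rows and two of the avg columns from the question, and the marker styling and legend are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# First three rows of the question's data, with date_time given as strings.
temps = pd.DataFrame(
    {
        "timestamp_group_start": [1670104020, 1670104080, 1670104140],
        "date_time": ["2022-12-03 22:47:00", "2022-12-03 22:48:00", "2022-12-03 22:49:00"],
        "avg_11006": [18.246300, 18.246300, 18.246300],
        "avg_11995": [19.998333, 19.994792, 20.024194],
    }
)

# Step 1: make sure the column has a datetime dtype.
temps["date_time"] = pd.to_datetime(temps["date_time"])

# Step 2: use it as the index.
indexed = temps.set_index("date_time")

# Step 3: pass only the y-value columns; the index supplies the x values.
avg_cols = [c for c in indexed.columns if c.startswith("avg")]

fig, axs = plt.subplots()
for col_name in avg_cols:
    axs.plot(indexed[col_name], marker="o", linestyle="", label=col_name)
axs.legend()
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```

If explicit x limits are needed, they should also be datetime values rather than integer timestamps so that they match the units of the axis.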