Starting from the code given here, I have written a version that uses Matplotlib instead of Seaborn. The data are now plotted on several figures and subplots, which makes them more readable, and I am closer to my goal: when the user hovers the cursor over a point, they should see all the information for that point, in particular the datetime.

Here it is:

import pandas as pd
import numpy as np
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import random

from datetime import datetime

# size of the database
n = 1000

nA = 4
nB = 9

no = np.arange(n)
date = np.random.randint(1e9, size=n).astype('datetime64[s]')
A = [''.join(['A',str(random.randint(1, nA))]) for j in range(n)]
B = [''.join(['B',str(random.randint(1, nB))]) for j in range(n)]
Epsilon1 = np.random.random_sample((n,))
Epsilon2 = np.random.random_sample((n,))
Epsilon3 = np.random.random_sample((n,))

data = pd.DataFrame({'no':no,
                     'Date':date,
                     'A':A,
                     'B':B,
                     'Epsilon1':Epsilon1,
                     'Epsilon2':Epsilon2,
                     'Epsilon3':Epsilon3})


def format_coord(x, y):
    string_x = datetime.utcfromtimestamp(x).strftime("%m/%d/%Y, %H:%M:%S")
    return 'x={}, y={:.4f}'.format(string_x,y)

def plot_Epsilon_matplotlib():
    
    for A in data['A'].sort_values().drop_duplicates().to_list():
        
        n_col = 2
        
        fig, axes = plt.subplots(np.ceil(nB/n_col).astype(int),n_col)
        
        for j, B in enumerate(data['B'].sort_values().drop_duplicates().to_list()):
            
            df = data.loc[(data['A']==A) & (data['B']==B)]
            df = df.sort_values("Date", ascending=True)
            
            axes.flatten()[j].plot(df["Date"],df['Epsilon1'],marker='x',c='b',label="Epsilon1")
            axes.flatten()[j].plot(df["Date"],df['Epsilon2'],marker='x',c='r',label="Epsilon2")
            axes.flatten()[j].plot(df["Date"],df['Epsilon3'],marker='x',c='g',label="Epsilon3")
            
            axes.flatten()[j].format_coord = format_coord

if __name__ == '__main__':

    plot_Epsilon_matplotlib()
    plt.show()

The goal is that when the user puts the cursor over a point, the status bar shows the full datetime of that point.

  • I have first tried to change the major formatter (as here):

    axes.flatten()[j].xaxis.set_major_formatter(mdates.DateFormatter('%Y/%m/%d %H:%M:%S'))
    

    but then the x tick labels become unreadable (especially if the user zooms in on a subplot).

  • I then tried to define my own format_coord, as here. My first attempt is in the full code above. The datetime format in the Matplotlib figure status bar is good, but the date remains in 1970!

  • After reading this discussion, I realized that this problem is related to the NumPy datetime64 to datetime conversion. I then wrote this new version of format_coord (strongly inspired by this answer):

    def format_coord_bis(x,y):
        dt64 = np.datetime64(datetime.utcfromtimestamp(x))
        unix_epoch = np.datetime64(0, 's')
        one_second = np.timedelta64(1, 's')
        seconds_since_epoch = (dt64 - unix_epoch) / one_second
        string_x = datetime.utcfromtimestamp(seconds_since_epoch).strftime("%m/%d/%Y, %H:%M:%S")
        return 'x={}, y={:.4f}'.format(string_x,y)
    

    but the date given in the status bar remains the 01/01/1970...

FObersteiner
Julien M.

1 Answer

I found the solution in this answer.

The function format_coord() should be defined as follows:

def format_coord(x, y):
    # mdates is matplotlib.dates, imported as in the question's code
    string_x = mdates.num2date(x).strftime('%Y-%m-%d %H:%M:%S')
    return 'x={}, y={:.4f}'.format(string_x,y)
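
As far as I understand it, the reason the first attempts stayed in 1970 is that the x value passed to format_coord is a Matplotlib date number, counted in days since the epoch, whereas datetime.utcfromtimestamp() reads its argument as seconds. The sketch below illustrates the difference using only the standard library. It assumes Matplotlib's default epoch of 1970-01-01 UTC (the default since Matplotlib 3.3); days_to_datetime and seconds_to_datetime are hypothetical helpers written for this illustration, not Matplotlib functions.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Matplotlib's default epoch, 1970-01-01T00:00:00 UTC (Matplotlib >= 3.3).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def days_to_datetime(x):
    """Hypothetical helper: read x as days since EPOCH (how a date number is meant to be read)."""
    return EPOCH + timedelta(days=x)


def seconds_to_datetime(x):
    """Hypothetical helper: read x as seconds since EPOCH (how utcfromtimestamp reads it)."""
    return EPOCH + timedelta(seconds=x)


# 2020-01-01 12:00 UTC expressed as days since the epoch
x = 18262.5

as_days = days_to_datetime(x).strftime('%Y-%m-%d %H:%M:%S')
as_seconds = seconds_to_datetime(x).strftime('%Y-%m-%d %H:%M:%S')

print(as_days)     # 2020-01-01 12:00:00
print(as_seconds)  # 1970-01-01 05:04:22
```

Reading the same number as seconds gives a time only a few hours after the epoch, which matches the 01/01/1970 shown in the status bar. In the real plot, num2date does the days-based conversion and also respects a non-default epoch, so it is preferable to a hand-written helper.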


Julien M.