I'm facing an issue when plotting a financial data series with matplotlib.

Basically, my dataframe contains a datetime index and 3 columns of data (market prices: "Open", "Close" and "PRC_RL").

The datetime index contains only working days.

The issue occurs when plotting the series against the datetime index: the plot shows a gap at every weekend. How can I fix it?

[Image: gaps on the weekends]

  • 4
    This is discussed in the matplotlib documentation: https://matplotlib.org/2.2.2/gallery/ticks_and_spines/date_index_formatter.html – Derek O Apr 01 '21 at 19:14
  • 3
    Please don't post images of code, data, or Tracebacks. Copy and paste it as text then format it as code (select it and type `ctrl-k`) ... [Discourage screenshots of code and/or errors](https://meta.stackoverflow.com/questions/303812/discourage-screenshots-of-code-and-or-errors). [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – wwii Apr 01 '21 at 19:19

1 Answer

The problem is that, once the weekends are dropped, the data points are no longer equidistant in time, which causes problems when the labels are placed on the x-axis. You can still do it by slicing the data:

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
# create dummy data
dates = pd.date_range(start='2020-04-24', end='2020-05-24', freq='D') # ISO dates avoid day/month ambiguity
val = np.random.rand(len(dates))
df = pd.DataFrame()
df['date'] = dates
df['value'] = val

Now plotting the dummy data with

df.plot(x='date',y='value')

results in: [plot of the dummy data; image not available]

One can exclude weekends by creating a logical vector lg, indicating non-weekend-days:

lg = []
for day in df['date']:
    day_ISO = day.isoweekday()
    if day_ISO == 6 or day_ISO == 7: # check for saturday & sunday
        lg.append( False )
    else:
        lg.append( True )

and plot the data again but sliced with this logical vector:

df[lg].plot(x='date',y='value')

[plot of the sliced data; image not available]
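The same logical vector can also be built without an explicit loop. This is a minimal sketch, assuming the dates sit in a column named `date` as in the dummy data above; `weekdays_only` is just an illustrative name:

```python
import numpy as np
import pandas as pd

# same kind of dummy data as above
dates = pd.date_range(start="2020-04-24", end="2020-05-24", freq="D")
df = pd.DataFrame({"date": dates, "value": np.random.rand(len(dates))})

# dt.dayofweek counts Monday as 0 and Sunday as 6,
# so values below 5 are Monday to Friday
lg = df["date"].dt.dayofweek < 5

weekdays_only = df[lg]
```

`df[lg].plot(x='date', y='value')` then works exactly as with the list built in the loop.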

You could also easily check for bank holidays in this way.
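As a rough sketch of such a holiday check: the two dates in `holidays` below are made up for illustration and are not taken from any real market calendar, so replace them with the holidays that apply to your data.

```python
import numpy as np
import pandas as pd

# same kind of dummy data as above
dates = pd.date_range(start="2020-04-24", end="2020-05-24", freq="D")
df = pd.DataFrame({"date": dates, "value": np.random.rand(len(dates))})

# hypothetical holiday list, for illustration only
holidays = pd.to_datetime(["2020-05-01", "2020-05-21"])

is_weekday = df["date"].dt.dayofweek < 5   # Monday=0 ... Friday=4
is_holiday = df["date"].isin(holidays)

# keep rows that are weekdays and not in the holiday list
trading_days = df[is_weekday & ~is_holiday]
```

The combined mask is used like `lg` before, e.g. `trading_days.plot(x='date', y='value')`.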

The dummy data above suggests that you are actually leaving out data. That is not the case with financial data, which simply is not generated on weekends. The line therefore stays continuous, but the x-axis is no longer unambiguous. I recommend adjusting the axis, e.g. by putting an explicit date on every label, or by marking the removed days with a small, gray, vertical line in the plot itself, or something similar.
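One possible way to adjust the axis, in the spirit of the matplotlib example linked in the comments, is to plot against the row position (0, 1, 2, ...) and translate each tick back to its date. This is only a sketch: the business-day dummy data and the helper name `format_date` are assumptions for illustration, not part of the original question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove for interactive use
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib.ticker import FuncFormatter

# dummy data on business days only, standing in for the real price series
idx = pd.bdate_range(start="2020-04-24", end="2020-05-24")
df = pd.DataFrame({"value": np.random.rand(len(idx))}, index=idx)

# one evenly spaced slot per row, so missing weekends take no space
positions = np.arange(len(df))

def format_date(x, pos=None):
    """Map a tick position back to the date stored at that row."""
    i = int(round(x))
    if 0 <= i < len(df):
        return df.index[i].strftime("%Y-%m-%d")
    return ""  # ticks outside the data get no label

fig, ax = plt.subplots()
ax.plot(positions, df["value"].to_numpy())
ax.xaxis.set_major_formatter(FuncFormatter(format_date))
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```

Because the x values are plain integers, Friday and the following Monday sit next to each other and the weekend gap disappears, while the labels still show real dates.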

Edit: using a pandas.DatetimeIndex

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
# create dummy data
dates = pd.date_range(start='2020-04-24', end='2020-05-24', freq='D') # ISO dates avoid day/month ambiguity
val = np.random.rand(len(dates))
df = pd.DataFrame()
df['date'] = dates
df['value'] = val
df = df.set_index('date') # create a pandas.DatetimeIndex

lg = []
for day in df.index: # iterate over the index values
    day_ISO = day.isoweekday()
    if day_ISO == 6 or day_ISO == 7: # check for saturday & sunday
        lg.append( False )
    else:
        lg.append( True )

df[lg].plot(y='value') # no need to set the x-axis explicitly
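With a `pandas.DatetimeIndex` the mask can also be computed directly from the index, without the loop. A minimal sketch, assuming the index is named `date` as above; `weekdays_only` is an illustrative name:

```python
import numpy as np
import pandas as pd

# dummy data with the dates as a pandas.DatetimeIndex
idx = pd.date_range(start="2020-04-24", end="2020-05-24", freq="D", name="date")
df = pd.DataFrame({"value": np.random.rand(len(idx))}, index=idx)

# DatetimeIndex.dayofweek counts Monday as 0 and Sunday as 6
weekdays_only = df[df.index.dayofweek < 5]
```

`weekdays_only.plot(y='value')` gives the same plot as `df[lg].plot(y='value')`.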
max
  • I understood your point. This is certainly a way to solve it. However, the datetime data in the dataframe is the index and so, I would like to find a way to solve it by manipulating the dataframe index directly. – Rodrigo Azevedo Apr 02 '21 at 10:51
  • @RodrigoAzevedo OK, I was too fast in reading. You may want to post an [MVE](https://stackoverflow.com/help/minimal-reproducible-example) next time, so the code fits your exact data. Anyway the approach remains the same, I adjusted the code for you though to work with a `pandas.DatetimeIndex`. – max Apr 02 '21 at 15:11