
I am plotting time series of temperature, which look like this:

[Figure: temperature time series plot]

To mark periods of Marine Heat Waves (MHW), I used `ax.fill_between` wherever the temperature is above the 90th percentile threshold, which produces the yellow and orange zones.

MY GOAL HERE: I need to eliminate the filled MHW zones that last less than 5 days, because these are not considered MHWs.

I assumed that I need to create a condition such as:

if temperature > threshold, count the consecutive days for which this holds and check whether the count is at least 5; if so, fill the area between the temperature curve and the threshold.
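The condition described above can be sketched in plain Python as a run-length check. This is only a minimal illustration under assumptions: the helper name `keep_long_runs` and the sample values are hypothetical, and the 5-day minimum is taken from the goal stated earlier.

```python
from itertools import groupby


def keep_long_runs(above, min_len=5):
    """Return booleans that are True only inside runs of consecutive
    True values that are at least `min_len` long."""
    mask = []
    # groupby splits the sequence into runs of identical consecutive values
    for is_above, run in groupby(above):
        run_len = len(list(run))
        mask.extend([bool(is_above) and run_len >= min_len] * run_len)
    return mask


# Hypothetical daily values (one entry per day), not taken from the dataset
temperature = [20, 22, 23, 20, 24, 24, 25, 26, 25, 20]
threshold = [21] * len(temperature)

above = [t > th for t, th in zip(temperature, threshold)]
mask = keep_long_runs(above, min_len=5)
```

Here the 2-day exceedance is dropped and only the 5-day one is kept, so `mask` is True only for days 5 to 9.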

I have not managed to write working code for this, so any help would be appreciated.

Here is the code to reproduce the plotting:

Dataset that I am working with (simplified: only the 90th percentile and the 2021 temperature are included).

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import dates as md

# df4 is the dataset mentioned above, with columns 'daily_sst_2021' and 'p90'

# Quick plot features (set the grid default before the axes are created)
plt.rcParams["axes.grid"] = True
fig, ax = plt.subplots(1, 1, figsize=(20, 7))

# Dates formatting
ax.set_xlim(np.datetime64('1970-01-01'), np.datetime64('1970-12-31'))
ax.xaxis.set_major_formatter(md.DateFormatter('%b'))
ax.margins(x=0)

# Plotting threshold (p90) and temperature from 2021
y1 = df4['daily_sst_2021']
y3 = df4['p90']
ax.plot(y1, label='2021', c='black', alpha=0.75)
ax.plot(y3, label='90th percentile', linestyle='--', c='green', alpha=1)

# Filling the areas where temperature exceeds the p90 threshold
ax.fill_between(df4.index, y1, y3, where=y1 >= y3, facecolor='orange', interpolate=True, alpha=0.5)
# This is probably where the minimum-duration condition needs to go
OlegRuskiy
  • 3
    I guess your question would get more traction if you simplified the example, provided simple dataframe instead of the whole dataset and only posted the code directly relevant to the question. It can get overwhelming to even start on such a detailed question and sometimes not all of the details are really needed to convey the message – Patryk Kowalski Jan 25 '22 at 11:14
  • Thanks for your suggestion @PatrykKowalski , I simplified the question as much as it is possible. – OlegRuskiy Jan 25 '22 at 11:32
  • 1
    I couldn't access the google sheet. In general I propose to create a bool column with your condition `y1>=y3`, then find where you have a long enough streak [e.g. like here](https://stackoverflow.com/questions/18196811/cumsum-reset-at-nan), create a new condition and use this one in `fill_between` – Patryk Kowalski Jan 25 '22 at 11:58
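
The suggestion in the last comment (a boolean column, then a streak check, then a new condition) could look roughly like the sketch below. It is an assumption-laden illustration, not the linked answer's method: it labels runs with a shift/cumsum idiom, the helper name `long_streak_mask` and the `demo` frame are hypothetical, and only the column names `daily_sst_2021` and `p90` come from the question.

```python
import pandas as pd


def long_streak_mask(cond: pd.Series, min_len: int = 5) -> pd.Series:
    """Keep True only where `cond` stays True for at least `min_len` rows in a row."""
    cond = cond.astype(bool)
    # A new run starts whenever the value differs from the previous row
    run_id = (cond != cond.shift()).cumsum()
    # Length of the run that each row belongs to
    run_len = cond.groupby(run_id).transform("size")
    return cond & (run_len >= min_len)


# Hypothetical stand-in for df4, with one row per day
demo = pd.DataFrame(
    {
        "daily_sst_2021": [20, 22, 23, 20, 24, 24, 25, 26, 25, 20],
        "p90": [21] * 10,
    },
    index=pd.date_range("2021-01-01", periods=10, freq="D"),
)

exceeds = demo["daily_sst_2021"] >= demo["p90"]
mask = long_streak_mask(exceeds, min_len=5)
```

Assuming the real data also has one row per day with no gaps, the resulting mask could then replace the original condition, for example `ax.fill_between(df4.index, y1, y3, where=long_streak_mask(y1 >= y3), ...)`, so that only exceedances of 5 or more consecutive days are filled.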
