I have time series data that I have resampled into monthly means and maximums over several years. I'm trying to select only specific months from those results, in my case May through October. I know this could probably be done with a loop (e.g. with an if statement or a find), but I'm hoping for a more efficient way to subset the data.
Here's what I have so far:
# Import packages
import pandas as pd
import csv
from matplotlib import pyplot as plt
import os
# Change working directory to where the file is located
cwd = os.getcwd()
os.chdir("C:/Users/zrr81/Downloads/Climate Dev/Python/Synoptic Client Data")
# Read in the file, using the parsed Date_Time column as the index
data = pd.read_csv('KCDC.2019-11-01.csv', parse_dates=['Date_Time'], index_col=['Date_Time'])
# Drop the first row after the column header
data = data.iloc[1:]
# Create tables with monthly mean & max wind speeds
wind = pd.DataFrame(data, columns=['wind_speed'])
wind.dropna(how='any', inplace=True)
# Convert the wind speeds to float before aggregating
wind['wind_speed'] = wind['wind_speed'].astype(str).astype(float)
wind_m = wind.resample('M').mean()
wind_max = wind.resample('M').max()
Here's a snippet of the output I'm working with:
2016-01-31 12.35
2016-02-29 19.55
2016-03-31 19.03
2016-04-30 16.98
2016-05-31 15.95
2016-06-30 16.46
2016-07-31 14.40
2016-08-31 13.89
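
For reference, here is a minimal sketch of the kind of loop-free selection I'm after, assuming the resampled result keeps a DatetimeIndex (as the dates above suggest). The small `wind_m` frame is a stand-in rebuilt from the snippet above rather than my real data, and the names `may_oct_mask` and `wind_m_may_oct` are made up for illustration. It builds a boolean mask from the month number of each index entry and keeps only the rows where the mask is true.

```python
import pandas as pd

# Stand-in for wind_m, rebuilt from the output snippet above (not the real file).
idx = pd.to_datetime([
    "2016-01-31", "2016-02-29", "2016-03-31", "2016-04-30",
    "2016-05-31", "2016-06-30", "2016-07-31", "2016-08-31",
])
wind_m = pd.DataFrame(
    {"wind_speed": [12.35, 19.55, 19.03, 16.98, 15.95, 16.46, 14.40, 13.89]},
    index=idx,
)

# True where the index month is May (5) through October (10), inclusive.
may_oct_mask = wind_m.index.month.isin([5, 6, 7, 8, 9, 10])

# Boolean indexing keeps only the matching rows, with no explicit loop.
wind_m_may_oct = wind_m[may_oct_mask]
print(wind_m_may_oct)
```

The same mask could presumably be applied to `wind_max`, since it comes from the same resample and should share the same index.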