I have time series data that I have broken into monthly means and maximums for several years. What I'm trying to figure out is how to select only specific months from that data, in my case May through October. I know this could probably be done with a loop (e.g. an if statement or using find), but I'm hoping to find a more efficient way of splitting the data.

Here's what I have so far:

#Import packages and assign to variables
import pandas as pd
import csv
from matplotlib import pyplot as plt
import os

#Change working directory to where file is located
cwd = os.getcwd()
os.chdir("C:/Users/zrr81/Downloads/Climate Dev/Python/Synoptic Client Data")

#Read in file
data = pd.read_csv('KCDC.2019-11-01.csv', parse_dates = ['Date_Time'], index_col = ['Date_Time'])

#Skip header rows
data = data.iloc[1:]

#Create tables with monthly mean & max wind speeds
wind = pd.DataFrame(data, columns = ['wind_speed'])
wind.dropna(how = 'any', inplace = True)
wind['wind_speed'] = wind['wind_speed'].astype(str).astype(float)
wind_m = wind.resample('M').mean()
wind_max = wind.resample('M').max()

Here's a snippet of the output I'm working with:

2016-01-31       12.35
2016-02-29       19.55
2016-03-31       19.03
2016-04-30       16.98
2016-05-31       15.95
2016-06-30       16.46
2016-07-31       14.40
2016-08-31       13.89
Zach Rieck

1 Answer

  • Use pandas Boolean indexing together with .isin to keep only the rows whose month is in a given list.
  • The date column must be a datetime type. Check types with df.info()
  • Convert a column to datetime with df['Date_Time'] = pd.to_datetime(df['Date_Time'])
df[df['Date_Time'].dt.month.isin([5, 6, 7, 8, 9, 10])]
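For reference, here is a minimal, self-contained sketch of the column-based approach. The sample values and the name may_oct are made up for illustration; only the Date_Time and wind_speed column names come from the question.

```python
import pandas as pd

# Hypothetical sample data: one made-up wind speed per month of 2016,
# with dates stored as plain strings, as they might be without parse_dates.
df = pd.DataFrame({
    "Date_Time": [f"2016-{m:02d}-15" for m in range(1, 13)],
    "wind_speed": [float(m) for m in range(1, 13)],
})

# The .dt accessor only works on datetime columns, so convert first.
df["Date_Time"] = pd.to_datetime(df["Date_Time"])

# Boolean mask: True where the month is May (5) through October (10).
months = [5, 6, 7, 8, 9, 10]
may_oct = df[df["Date_Time"].dt.month.isin(months)]

print(may_oct)
```

The mask is computed for the whole column at once, so no explicit loop or if statement is needed.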

Update from comment

  • If the dates are in a datetime index rather than a column, .dt is not needed; use df.index.month directly
df[df.index.month.isin([5, 6, 7, 8, 9, 10])]
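A rough sketch of the index-based version applied to resampled monthly means, which is closer to the setup in the question. The daily values here are invented, and the "MS" (month start) resample alias is my own choice so the snippet behaves the same across pandas versions; the question labels months by month end instead.

```python
import pandas as pd

# Hypothetical daily wind speeds for two years (values are made up:
# each day's speed is simply its month number).
idx = pd.date_range("2016-01-01", "2017-12-31", freq="D")
wind = pd.DataFrame({"wind_speed": [float(d.month) for d in idx]}, index=idx)

# Monthly means, labelled by the first day of each month ("MS").
wind_m = wind.resample("MS").mean()

# A DatetimeIndex exposes .month directly, so no .dt accessor is needed.
months = [5, 6, 7, 8, 9, 10]
wind_m_may_oct = wind_m[wind_m.index.month.isin(months)]

print(wind_m_may_oct)
```

The same mask works on the monthly maximums (wind_max in the question), since both frames share the same kind of index.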
Trenton McKinney
    I had to modify this slightly since my values are timestamped indices: wind_m = wind_m[wind_m.index.month.isin([5,6,7,8,9,10])] (replace dt with index). However, your answer still led me there, so thank you! – Zach Rieck May 08 '20 at 04:14