7

I am attempting to build 3 separate columns in my dataframe for the value of the time stamp HOUR, DAY, MONTH with the DatetimeIndex.

I appologize for data that cant be reproduced because my data set is being read from a CSV File.

boilerDf = pd.read_csv('C:\\Users\\Python Scripts\\Deltadata.csv', index_col='Date', parse_dates=True)

print(boilerDf.info())

This returns:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 23797 entries, 2017-10-25 05:00:08.436000 to 2018-01-02 05:45:14.419000
Data columns (total 3 columns):
hwr    23797 non-null float64
hws    23797 non-null float64
oat    23797 non-null float64
dtypes: float64(3)
memory usage: 743.7 KB
None

I can see on the pandas.pydata.org website their is 3 methods for what I am trying to do except I want to create separate dataframe (columns):

DatetimeIndex.month 
DatetimeIndex.day   
DatetimeIndex.hour  

This code below does not work on adding a seperate dataframe column for the hour of the date time index... Any ideas?

boilerDf['Hour'] = boilerDf.DatetimeIndex.hour

Kind regards

I also have the data uploaded here on Github: bbartling/Data on Github

bbartling
  • 3,288
  • 9
  • 43
  • 88
  • 1
    Can you post some sample data from your file? We can't run any of your code without your data, or sample data that approximates it. – Evan Feb 02 '18 at 23:44
  • I added a link to a Github account that I created. Its the same CSV file – bbartling Feb 05 '18 at 15:59

1 Answers1

9

I initially suggested .index.strftime() for this answer. However, Henry has also found jezrael's Pandas time series data Index from a string to float which returns column of type integer. I have therefore included an extended version of the latter here. There is a small difference in the output when using the two different methods.

from numpy.random import randint
import pandas as pd

# Create a df with a date-time index with data every 6 hours
rng = pd.date_range('1/5/2018 00:00', periods=5, freq='6H')
df = pd.DataFrame({'Random_Number':randint(1, 10, 5)}, index=rng)

# Getting different time information in columns of type object
df['year'] = df.index.strftime('%Y')
df['month'] = df.index.strftime('%b')
df['date'] = df.index.strftime('%d')
df['hour'] = df.index.strftime('%H')
df['Day_of_week'] = df.index.strftime('%a')

# Getting different time information in columns of type integer
df['year'] = df.index.year
df['month'] = df.index.month
df['date'] = df.index.day
df['hour'] = df.index.hour
df['Day_of_week'] = df.index.dayofweek

df.head()
                     Random_Number  year month date hour Day_of_week
date                                                                
2018-01-05 00:00:00              8  2018   Jan   05   00         Fri
2018-01-05 06:00:00              8  2018   Jan   05   06         Fri
2018-01-05 12:00:00              1  2018   Jan   05   12         Fri
2018-01-05 18:00:00              4  2018   Jan   05   18         Fri
2018-01-06 00:00:00              7  2018   Jan   06   00         Sat

                     Random_Number  year  month  date  hour  Day_of_week
2018-01-05 00:00:00              3  2018      1     5     0            4
2018-01-05 06:00:00              1  2018      1     5     6            4
2018-01-05 12:00:00              9  2018      1     5    12            4
2018-01-05 18:00:00              5  2018      1     5    18            4
2018-01-06 00:00:00              8  2018      1     6     0            5
KRKirov
  • 3,854
  • 2
  • 16
  • 20
  • This is working... Is there a work around for `df['Day_of_week']` to be a numerical value of the day of the week? IE Sunday = 0, Monday = 1, Tuesday = 3, etc... Thank you – bbartling Feb 05 '18 at 16:03
  • Also can the `df['month']` be a numerical value? I also edited the post to a Github account that I created for the CSV data if needed. Thanks for your time. – bbartling Feb 05 '18 at 16:10
  • A string to a float :) – bbartling Feb 05 '18 at 16:22
  • This is answered here https://stackoverflow.com/questions/48627109/pandas-time-series-data-index-from-a-string-to-float?noredirect=1#comment84251116_48627109 – bbartling Feb 05 '18 at 16:40
  • Well done for getting the input from jezrael. I have modified this answer to demonstrate the small differences in the output produced by the two methods. – KRKirov Feb 05 '18 at 17:48
  • @ KRKirov: Could you help me this SO post? Very similar... Thank you https://stackoverflow.com/questions/52766438/python-pandas-date-time-index-create-dataframe – bbartling Oct 11 '18 at 18:07