3

Suppose I have two DataFrames: intraday which has one row per minute, and daily which has one row per day.

How can I add a column intraday['some_val'] where some_val is taken from the daily['some_val'] row where the intraday.index value (date component) equals the daily.index value (date component)?

feetwet
  • 3,248
  • 7
  • 46
  • 84

1 Answers1

3

Given the following setup,

intraday = pd.DataFrame(index=pd.date_range('2016-01-01', '2016-01-07', freq='T'))
daily = pd.DataFrame(index=pd.date_range('2016-01-01', '2016-01-07', freq='D'))
daily['some_val'] = np.arange(daily.shape[0])

you can create a column from the date component of both indices, and merge on that column

daily['date'] = daily.index.date
intraday['date'] = intraday.index.date    
daily.merge(intraday)
            date  some_val
0     2016-01-01         0
1     2016-01-01         0
2     2016-01-01         0
3     2016-01-01         0
4     2016-01-01         0
...          ...       ...
8636  2016-01-06         5
8637  2016-01-06         5
8638  2016-01-06         5
8639  2016-01-06         5
8640  2016-01-07         6

Alternatively, you can take advantage of automatic index alignment, and use fillna.

intraday['some_val'] = daily['some_val']
intraday.fillna(method='ffill', downcast='infer')
                     some_val
2016-01-01 00:00:00         0
2016-01-01 00:01:00         0
2016-01-01 00:02:00         0
2016-01-01 00:03:00         0
2016-01-01 00:04:00         0
...                       ...
2016-01-06 23:56:00         5
2016-01-06 23:57:00         5
2016-01-06 23:58:00         5
2016-01-06 23:59:00         5
2016-01-07 00:00:00         6

Note that this only works if the time component of your daily index is 00:00.

Igor Raush
  • 15,080
  • 1
  • 34
  • 55
  • How can we `merge` to produce a DataFrame with the same index as the higher-frequency DF (`intraday` in this case)? The first example produces an iterative index and appears to discard the `intraday` index. Note also: The second example works if and only if at least one exact same DateTime occurs in the lower frequency of both DataFrames. E.g., if `intraday` doesn't contain `00:00` each day it will fail even though the `daily` index does. – feetwet Dec 21 '16 at 22:45
  • 1
    In this case you can just use `daily.merge(intraday).set_index(intraday.index)`. Seems that you can't ask `merge` to keep the index of one of the data frames (see [this answer](http://stackoverflow.com/questions/11976503/how-to-keep-index-when-using-pandas-merge)). – Igor Raush Dec 21 '16 at 23:44