1

I'm trying to create a new series within a dataframe by mapping a dictionary along two dimensions: first matching the key, then selecting a value from the array stored under that key. The existing series is a datetime; the key is matched against the day of the date and the value is selected by the hour (hence the new series 'dh').

There is a similar question for mapping a 1-dimensional array here: Adding a new pandas column with mapped value from a dictionary, but that maps the entire array to each day.

Current code:

import pandas as pd

# Note: pandas 2.0 removed the `closed` argument; on newer versions use inclusive='left' instead.
df = pd.DataFrame({'datetime':pd.date_range('1/1/2018','1/4/2018', freq = '1H', closed = 'left')})

day_hour = {1:range(48,0,-2),
            2:range(96,0,-4),
            3:range(120,0,-5) }

df['dh'] = df['datetime'].dt.day.map(day_hour)

Output snippet:

              datetime                                                 dh
0  2018-01-01 00:00:00  [48, 46, 44, 42, 40, 38, 36, 34, 32, 30, 28, 2...
1  2018-01-01 01:00:00  [48, 46, 44, 42, 40, 38, 36, 34, 32, 30, 28, 2...
2  2018-01-01 02:00:00  [48, 46, 44, 42, 40, 38, 36, 34, 32, 30, 28, 2...
3  2018-01-01 03:00:00  [48, 46, 44, 42, 40, 38, 36, 34, 32, 30, 28, 2...
4  2018-01-01 04:00:00  [48, 46, 44, 42, 40, 38, 36, 34, 32, 30, 28, 2...
5  2018-01-01 05:00:00  [48, 46, 44, 42, 40, 38, 36, 34, 32, 30, 28, 2...
6  2018-01-01 06:00:00  [48, 46, 44, 42, 40, 38, 36, 34, 32, 30, 28, 2...
7  2018-01-01 07:00:00  [48, 46, 44, 42, 40, 38, 36, 34, 32, 30, 28, 2...
8  2018-01-01 08:00:00  [48, 46, 44, 42, 40, 38, 36, 34, 32, 30, 28, 2...
9  2018-01-01 09:00:00  [48, 46, 44, 42, 40, 38, 36, 34, 32, 30, 28, 2...
10 2018-01-01 10:00:00  [48, 46, 44, 42, 40, 38, 36, 34, 32, 30, 28, 2...
11 2018-01-01 11:00:00  [48, 46, 44, 42, 40, 38, 36, 34, 32, 30, 28, 2...

Desired Output:

              datetime   dh
0  2018-01-01 00:00:00   48
1  2018-01-01 01:00:00   46
2  2018-01-01 02:00:00   44
3  2018-01-01 03:00:00   42
4  2018-01-01 04:00:00   40
5  2018-01-01 05:00:00   38
...
elPastor

2 Answers

2

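If your dict is well formed (its keys follow the order of the days in your dataframe and each value has one entry per row of that day), you do not need map: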

df['dh']=sum(map(list,day_hour.values()),[])
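As a minimal sketch of why this works and when it breaks: the one-liner concatenates the per-day sequences in dict order and assigns them by position, so the total number of values has to equal the number of rows. The explicit length check and the `flat` name are additions for illustration, and the frame is built with `periods=72` rather than the question's `closed='left'` so that it runs on recent pandas versions.

```python
import pandas as pd

# 72 hourly timestamps covering 2018-01-01 through 2018-01-03
df = pd.DataFrame({
    'datetime': pd.date_range('2018-01-01', periods=72, freq=pd.Timedelta(hours=1))
})

day_hour = {1: range(48, 0, -2),
            2: range(96, 0, -4),
            3: range(120, 0, -5)}

# Concatenate the per-day sequences in dict (insertion) order
flat = sum(map(list, day_hour.values()), [])

# The assignment is purely positional, so the lengths must agree;
# otherwise pandas raises "Length of values does not match length of index"
if len(flat) == len(df):
    df['dh'] = flat
else:
    print(f'cannot assign: {len(flat)} values for {len(df)} rows')
```

If a day appears more than once in the data, or the dict order differs from the row order, the lengths or positions no longer line up and this shortcut does not apply.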

Update (for data where several dates match the same key):

df['dh'] = df['datetime'].dt.day.map(day_hour)
df['new']=df.groupby(df['datetime'].dt.date).cumcount()

df['dh']=df.apply(lambda x : x['dh'][x['new']],axis=1)
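A self-contained sketch of the same idea, assuming data in which two different dates share the key `1` (the situation raised in the comments). It numbers the rows within each calendar date with `cumcount` and uses that number as the position in the day's sequence. The list comprehension over `zip` stands in for the row-wise `apply` above, and the `stamps` and `pos` names are illustrative only.

```python
import pandas as pd

hour = pd.Timedelta(hours=1)

# Two separate dates that share the same day-of-month key (1)
stamps = pd.date_range('2018-01-01', periods=24, freq=hour).append(
    pd.date_range('2018-02-01', periods=24, freq=hour))
df = pd.DataFrame({'datetime': stamps})

day_hour = {1: range(48, 0, -2)}

# Position of each row within its own calendar date: 0, 1, 2, ...
pos = df.groupby(df['datetime'].dt.date).cumcount()

# Look up the day's sequence, then pick the element at that position
df['dh'] = [day_hour[d][p] for d, p in zip(df['datetime'].dt.day, pos)]
```

Grouping by the full date rather than by the day number is what keeps January 1 and February 1 from being counted as one 48-row group.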
BENY
  • Thanks Wen, but can you clarify what a "well created" dict means? – elPastor Mar 14 '18 at 16:46
  • @pshep123 Created based on the order of your dataframe, which means that if `df['datetime'].dt.day` returns `[1,2,3]`, the dict keys should be `[1,2,3]`. – BENY Mar 14 '18 at 16:49
  • Then it is exactly that, per the example above. However, now I'm having trouble with the full code (vs. this example code) and getting a `Length of values does not match length of index` error. – elPastor Mar 14 '18 at 16:51
  • @pshep123 Do we have both `2018-01-01 00:00:00` and `2018-02-01 00:00:00` in the real data? – BENY Mar 14 '18 at 16:52
  • yes - there would be multiple matches against the key. – elPastor Mar 14 '18 at 16:57
  • After a little tweaking (had to groupby two variables in my actual code), this did the trick! Thanks @Wen – elPastor Mar 14 '18 at 17:33
0

I was playing code golf with your problem. Starting from your current output, where `dh` already holds each day's full sequence:

df.assign(dh=[h[t.hour] for t, h in df.values])

             datetime  dh
0 2018-01-01 00:00:00  48
1 2018-01-01 01:00:00  46
2 2018-01-01 02:00:00  44
3 2018-01-01 03:00:00  42
4 2018-01-01 04:00:00  40
...
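For reference, a sketch of a variant that skips the intermediate `dh` column and indexes the dict directly by day and hour. This is an adaptation for illustration, not the one-liner above: it assumes every day present in the data is a key in `day_hour` and that each sequence has an entry for every hour that occurs.

```python
import pandas as pd

# 72 hourly timestamps covering 2018-01-01 through 2018-01-03
df = pd.DataFrame({
    'datetime': pd.date_range('2018-01-01', periods=72, freq=pd.Timedelta(hours=1))
})

day_hour = {1: range(48, 0, -2),
            2: range(96, 0, -4),
            3: range(120, 0, -5)}

# For each timestamp: pick the day's sequence, then the element at that hour
result = df.assign(dh=[day_hour[t.day][t.hour] for t in df['datetime']])
```

Because the position comes from `t.hour` rather than from a running count, a missing hour in the data leaves the remaining rows aligned with their own hour's value.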
piRSquared