1

The first 20 observations in my data are:

    id                                          day         hour    consumption
0   012af199245dedacf9ea0ba6eedef4e89272c7dc    Saturday    8       0.000000
1   019ebd48fe9c9ab20051e9de1d5ddfc6fd13c55b    Tuesday     16      0.000000
2   0310daaa6368cf0618f341351b8451e509da27d7    Wednesday   17      0.000000
3   04a2ddb034ff774cda02130fd59b280d55f762d7    Tuesday     16     -0.017699
4   04d61391eeea5b957847dbe08b52d88e64909dbf    Thursday    15      0.000000
5   04f1fa8b29c58e19eebf0e26169975a66ec7cbbf    Tuesday     15      0.000000
6   0561aa699b6c91c842b850c6b73ee4b3c8cbb03b    Thursday    12     -0.002597
7   059492a3600ef0b39726af2201a0ad87610a4a02    Thursday    17      0.000000
8   059fb9175372802b43b3fdcebd2a507bc89e71b0    Thursday    12     -0.001541
9   05da142ebe95e15ab30dee30d1a982d8f419dfb2    Tuesday     20     -0.003050
10  0663c2fd03deecf7f52c3e5c7c0be5c94a3292b8    Sunday      13     -0.005613
11  07040b85d9c0c0ff122b3fef3ab73eab6c53ff0e    Saturday    18      0.000000
12  07a33356cb6330b2090152d30413b224ad1c018b    Saturday    20      0.005013
13  07d67b08fab92657c699dbeec931a48c9f1cfbf7    Friday      15     -0.015675
14  07f92e8eb78f9d8ab6446ffd2649990cffce2ead    Friday      16     -0.004035
15  086cfca739da633d89100874a6c91c37e04880af    Friday      0      -0.004068
16  0a64e559b80b819b2a48a939fa96b1f3f3791e54    Monday      12     -0.007687
17  0b477ac123374072c5acf34d1d063d6ae6c4bf0b    Friday      21      0.000000
18  0bf144e77495b06fb319f4a312f09015da7c5afd    Tuesday     4       0.000000
19  0d1263d90f5a5449a1d0eb80c0f217daff646d36    Saturday    8      -0.005963

I am trying to create a heatmap by doing:

sns.heatmap(df.pivot("day", "hour", "consumption"))

But I am getting the error:

ValueError: Index contains duplicate entries, cannot reshape

I attempted using pivot_table() instead which according to the documentation accounts for this error. But then I get:

DataError: No numeric types to aggregate

Alternatively, plotting the number of the day of the week instead works:

# Convert dates into the number of the day of the week
# 0=Mon; 6=Sunday
df['day_num'] = df['timestamp'].dt.weekday

sns.heatmap(df.pivot_table(index="day_num", columns="hour", values="consumption"))

However, I'd like to keep the names (or abbreviations) in the plot.

How can I fix this?

JohanC
  • 71,591
  • 8
  • 33
  • 66
Joehat
  • 979
  • 1
  • 9
  • 36

1 Answers1

1

Note that the params of pivot and pivot_table are ordered differently, so if you don't name the params, the orders need to be changed accordingly:

  • pivot

    # pivot(index=None, columns=None, values=None)
    df.pivot('day', 'hour', 'consumption')
    
  • pivot_table

    # pivot_table(values=None, index=None, columns=None, ...)
    df.pivot_table('consumption', 'day', 'hour')
    

To avoid ambiguity, I suggest using named params:

sns.heatmap(df.pivot_table(index='day', columns='hour', values='consumption'))

pivot_table heatmap

tdy
  • 36,675
  • 19
  • 86
  • 83