-1

I have a data frame obtained by grouping an initial data frame by the 'hour' and 'site' columns, so it holds a 'value' for each 'hour' and 'site' combination. I want to add the hours that have no 'value' and fill them with zero. The 'hour' range is 0-23. How can I do this?

Left is input, right is expected output


phoenix
  • 328
  • 2
  • 12

2 Answers

1

You can try this:

import numpy as np
import pandas as pd

raw_df = pd.DataFrame(
    {
        "Hour": [1, 2, 4, 12, 0, 2, 7, 13],
        "Site": ["x", "x", "x", "x", "y", "y", "y", "y"],
        "Value": [1, 1, 1, 1, 1, 1, 1, 1],
    }
)
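# Build every (Hour, Site) combination: hours 0-23 repeated once per site.
full_hour = pd.DataFrame(
    {
        "Hour": np.concatenate(
            [range(24) for site_name in raw_df["Site"].unique()]
        ),
        "Site": np.concatenate(
            [[site_name] * 24 for site_name in raw_df["Site"].unique()]
        ),
    }
)
# Left-merge so hours missing from raw_df get NaN in "Value", then fill with 0.
result = full_hour.merge(raw_df, on=["Hour", "Site"], how="left").fillna(0)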

This gives you the result you want. But I suggest you paste your test data into your question as text instead of an image. Answerers are not responsible for recreating your data, so think about how to make it easy for others to answer your question.
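As an alternative to the merge, here is a minimal sketch that builds the full set of (Site, Hour) pairs with `pd.MultiIndex.from_product` and fills the gaps with `reindex`. It reuses the same made-up `raw_df` and assumes each (Site, Hour) pair appears at most once, which should hold after a groupby; the names `full_index` and `result` are just for illustration.

```python
import pandas as pd

# Same made-up sample data as in the answer above.
raw_df = pd.DataFrame(
    {
        "Hour": [1, 2, 4, 12, 0, 2, 7, 13],
        "Site": ["x", "x", "x", "x", "y", "y", "y", "y"],
        "Value": [1, 1, 1, 1, 1, 1, 1, 1],
    }
)

# Every (Site, Hour) pair that should appear in the output.
full_index = pd.MultiIndex.from_product(
    [raw_df["Site"].unique(), range(24)], names=["Site", "Hour"]
)

# Reindex onto the full grid; pairs missing from raw_df get Value 0.
result = (
    raw_df.set_index(["Site", "Hour"])
    .reindex(full_index, fill_value=0)
    .reset_index()
)
```

Because `fill_value=0` is applied during the reindex, the "Value" column keeps its integer dtype, whereas `fillna(0)` after a merge leaves it as float.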

Xu Qiushi
  • 1,111
  • 1
  • 5
  • 10
  • I wanted to paste the dataframe, but I don't know how to. If I paste it directly, everything ends up on a single line, which is extra work for readers and they tend to ignore it. So I pasted the picture – phoenix Jun 17 '20 at 10:24
  • You can write the raw data part the way I did in my code, and put it in your question. Then others can copy the code and get that df easily. – Xu Qiushi Jun 17 '20 at 10:28
1

If you want to set the values in the hours column to zero wherever they are not in the range 0-23, here is what to do. I didn't fully understand your question, so I am assuming this is what you want. I have used a dummy example, as you have not provided your own data.

import pandas as pd
import numpy as np
df = pd.DataFrame({'Date':['10/2/2011', '11/2/2011', '12/2/2011', 
'13/2/2011','14/2/2011'], 
               'Product':['Umbrella', 'Matress', 'Badminton', 'Shuttle','ewf'], 
               'Last_Price':[1200, 1500, 1600, 352,'ee'], 
               'Updated_Price':[12, 24, 0, 1,np.nan],
               'Discount':[10, 10, 10, 10, 11]}) 
df['Updated_Price'] = df['Updated_Price'].fillna(0)
df.loc[df['Updated_Price']>23,'Updated_Price']=0

This replaces all NaN values with 0, and also replaces values greater than 23 with 0.
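Note that the comparison above only catches values greater than 23. If negative values should also count as out of range, a rough sketch using `Series.between` could look like this; the `hours` series holds made-up values purely for illustration.

```python
import numpy as np
import pandas as pd

# Made-up values: some valid hours, some out of range, one missing.
hours = pd.Series([12, 24, 0, -1, np.nan, 23])

# Replace missing values with 0 first.
cleaned = hours.fillna(0)

# Keep values inside 0-23 (inclusive on both ends); set everything else to 0.
cleaned = cleaned.where(cleaned.between(0, 23), 0)
```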

maria_g
  • 130
  • 1
  • 6