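The data I was given is grouped by "latitude": each latitude value appears only on the first row of its group, and the remaining rows are left empty. It looks like this: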

import pandas as pd

input_df = pd.DataFrame({"latitude": ["60", "", "", "", "70", "","80","","","","","85"],
                         "longitude": [50, 51, 52, 53, 50, 51,50,51,52,53,54,50],
                         "pollution": [10,20,30,40,50,60,70,80,90,100,110,120]
                        })

input_df

# see the input_df
    latitude    longitude   pollution
0   60          50          10
1               51          20
2               52          30
3               53          40
4   70          50          50
5               51          60
6   80          50          70
7               51          80
8               52          90
9               53          100
10              54          110
11  85          50          120

Is there a way to fill the gaps in the "latitude" column, so that each empty cell takes the last value above it? The desired output is:

    latitude    longitude   pollution
0   60          50          10
1   60          51          20
2   60          52          30
3   60          53          40
4   70          50          50
5   70          51          60
6   80          50          70
7   80          51          80
8   80          52          90
9   80          53          100
10  80          54          110
11  85          50          120

Many thanks!

Jeremy
  • StackOverflow is not a design, coding, research, or tutorial service. https://stackoverflow.com/help/how-to-ask – NotZack Jul 27 '20 at 14:39
  • Thanks for the reminder. I did search in several different ways before posting, but I really do not know how to solve this. I also think others may run into the same problem, so it is relevant to the community. – Jeremy Jul 27 '20 at 14:46
  • Replace empty strings https://stackoverflow.com/q/40711900/6692898 and fillna https://stackoverflow.com/q/38134012/6692898 – RichieV Jul 27 '20 at 14:47
  • The user guide has a dedicated section for missing data https://pandas.pydata.org/docs/user_guide/missing_data.html – RichieV Jul 27 '20 at 14:54

1 Answer

You can replace the "" elements with NaN and then fill each NaN with the value from the row above it. In your case:

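import numpy as np

df = input_df  # the DataFrame from the question
df["latitude"] = df["latitude"].replace("", np.nan)  # mark the gaps as missing
for i in range(1, len(df)):
    # copy the previous row's latitude into each missing cell
    if pd.isna(df["latitude"].iloc[i]):
        df.loc[df.index[i], "latitude"] = df["latitude"].iloc[i - 1]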
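
The same replace-then-fill idea can probably be done without an explicit loop. The sketch below is an alternative, not part of the answer above: it assumes the gaps are exactly empty strings, as in the question's `input_df`, and uses pandas' forward fill (`ffill`) to carry the last seen latitude downward. The name `filled_df` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
input_df = pd.DataFrame({
    "latitude": ["60", "", "", "", "70", "", "80", "", "", "", "", "85"],
    "longitude": [50, 51, 52, 53, 50, 51, 50, 51, 52, 53, 54, 50],
    "pollution": [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120],
})

# Work on a copy so the original frame is left untouched
filled_df = input_df.copy()

# Turn empty strings into NaN, then propagate the last valid value downward
filled_df["latitude"] = filled_df["latitude"].replace("", np.nan).ffill()

print(filled_df)
```

Note that the latitude values stay strings here, as they are in the input. If numbers are needed, something like `pd.to_numeric(filled_df["latitude"])` could be applied afterwards.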