I have a DataFrame with the following columns: "Street", "State", "Country", "Zip Code",
and some rows contain NaN values.
The result of df['Street'].isna().sum() is 31, but df['State'].isna().sum() is 204, and likewise for the remaining columns.
As can be seen, the "Street" column has fewer rows with NaN values than the other columns.
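The per-column counts can also be obtained in a single call. A minimal sketch, assuming a small made-up frame in place of the real data (the values below are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the real frame, which is not shown here.
df = pd.DataFrame({
    "Street": ["1 Main St", "1 Main St", np.nan],
    "State": [np.nan, "IL", np.nan],
    "Country": ["US", np.nan, np.nan],
    "Zip Code": [np.nan, np.nan, "62701"],
})

# isna() marks every missing cell; sum() then counts them column by column.
nan_counts = df.isna().sum()
print(nan_counts)
```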
I want to iterate over the whole DataFrame by state, country and zip code, and whenever I encounter a NaN value, check whether another row has the same street. If one exists, I want to fill the NaN with that row's zip code/country/state value.
import pandas as pd

def get_matches(address, df_street, df_location):
    # Return the first non-NaN location recorded for the same street, or None.
    for street, location in zip(df_street, df_location):
        if street == address and pd.notna(location):
            return location
    return None

# Fill each missing City/State/Zip Code from another row with the same street.
for idx, street in zip(df.index, df['Street']):
    for col in ['City', 'State', 'Zip Code']:
        # A single value has no .isna() method, so use pd.isna() on the scalar.
        if pd.isna(df.at[idx, col]):
            location = get_matches(street, df['Street'], df[col])
            if location is not None:
                # Write through .at; chained df[col][idx] = ... may hit a copy.
                df.at[idx, col] = location
This is my code, but it doesn't work properly.
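One possible alternative to the row-by-row loop is to group by street and take the first known value in each group. The following is only a sketch under assumptions: the sample data is made up, the column names "City", "State" and "Zip Code" are taken from the code above, and rows with a missing street are deliberately left untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real frame is not shown in the post.
df = pd.DataFrame({
    "Street": ["1 Main St", "1 Main St", "2 Oak Ave", "2 Oak Ave", None],
    "City": ["Springfield", np.nan, np.nan, "Portland", np.nan],
    "State": [np.nan, "IL", "OR", np.nan, np.nan],
    "Zip Code": ["62701", np.nan, np.nan, "97201", np.nan],
})

cols = ["City", "State", "Zip Code"]

# Only rows with a known street can be matched against other rows.
has_street = df["Street"].notna()

# For each such row, take the first non-NaN value per column among
# all rows that share its street ("first" skips missing values).
first_known = df[has_street].groupby("Street")[cols].transform("first")

# Existing values are kept; only the gaps are filled (aligned on the index).
df[cols] = df[cols].fillna(first_known)
print(df)
```

If a street appears with conflicting values in different rows, this picks whichever comes first, which is the same choice the loop version makes.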