I have a list of city names and a DataFrame (df) with city, state, and zipcode columns. Some of the zipcodes are missing. When a zipcode is missing, I want to fill in a generic zipcode based on the city. For example, if the city is San Jose, the zipcode should be the generic 'SJ_zipcode'.
pattern_city = '|'.join(cities) #works
foundit = ( (df['cty_nm'].str.contains(pattern_city, flags=re.IGNORECASE)) & (df['zip_cd']==0) & (df['st_cd'].str.match('CA') ) ) #works--is this foundit a df?
df['zip_cd'] = foundit.replace( 'SJ_zipcode' ) #nope, error
Error: "Invalid dtype for pad_1d [bool]"
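On the "is this foundit a df?" question: combining element-wise comparisons with & gives a boolean Series, not a DataFrame, and the pad_1d error appears to come from calling replace on that boolean Series with only one argument. A minimal sketch of using the Series as a row mask instead, assuming made-up sample data (the rows, the cities list, and the cast to object dtype are illustrative assumptions, not from the original post):

```python
import re
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "cty_nm": ["San Jose", "san jose", "San Jose", "Fresno"],
    "st_cd": ["CA", "CA", "CA", "CA"],
    "zip_cd": [95112, 0, 0, 0],
})
cities = ["San Jose"]  # assumed list of city names

# Escape each name so characters like "." are matched literally.
pattern_city = "|".join(re.escape(c) for c in cities)

# Each condition is a boolean Series; & combines them row by row.
foundit = (
    df["cty_nm"].str.contains(pattern_city, flags=re.IGNORECASE)
    & (df["zip_cd"] == 0)
    & df["st_cd"].str.match("CA")
)

# Assumption: cast to object so the column can hold both ints and strings.
df["zip_cd"] = df["zip_cd"].astype(object)

# Use the mask to select rows and name the column in the same .loc call.
df.loc[foundit, "zip_cd"] = "SJ_zipcode"
```

Printing type(foundit) and foundit.sum() is a quick way to check what the mask is and how many rows it selects before assigning anything.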
Implemented with where
df['zip_cd'].where( (df['cty_nm'].str.contains(pattern_city, flags=re.IGNORECASE)) & (df['zip_cd']==0) & (df['st_cd'].str.match('CA') ), "SJ_Zipcode", inplace = True) #nope, empty set; all set to nan?
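One likely reason the where attempt misbehaves: Series.where keeps the values where the condition is True and replaces the values where it is False, which is the opposite of what is wanted here. Series.mask (or where with the condition negated) replaces the rows that match. A small sketch, using a hypothetical one-column stand-in for the full three-part condition:

```python
import pandas as pd

# Hypothetical zipcode column; 0 marks a missing value as in the question.
zips = pd.Series([95112, 0, 0, 94103], dtype=object)
missing = zips == 0  # stand-in for the full city/state/zip condition

# where: keep values where the condition is True, replace everything else.
kept_only_missing = zips.where(missing, "SJ_Zipcode")

# mask: replace values where the condition is True, keep everything else.
filled = zips.mask(missing, "SJ_Zipcode")

# Equivalent to mask: where with the condition inverted.
same = zips.where(~missing, "SJ_Zipcode")
```

Both calls return a new Series, so the result would still need to be assigned back, for example df['zip_cd'] = df['zip_cd'].mask(cond, 'SJ_Zipcode').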
Implemented with loc
df['zip_cd'].loc[ (df['cty_nm'].str.contains(pattern_city, flags=re.IGNORECASE)) & (df['zip_cd']==0) & (df['st_cd'].str.match('CA') ) ] = "SJ_Zipcode"
Some possible solutions that did not work
- `df.loc[df['First Season'] > 1990, 'First Season'] = 1`, which I used as `df.loc[foundit, 'zip_cd'] = 'SJ_zipcode'`. From "Pandas DataFrame: replace all values in a column, based on condition" and similar to, or the same as, "Conditional Replace Pandas".
- `df['c'] = df.apply( lambda row: row['a']*row['b'] if np.isnan(row['c']) else row['c'], axis=1)`; however, I am not multiplying values. https://datascience.stackexchange.com/questions/17769/how-to-fill-missing-value-based-on-other-columns-in-pandas-dataframe
- I tried a solution using `where`; however, it seemed to replace the values where the condition was not met with nan, and the nan value was not helpful. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.where.html
- This conditional approach looked promising, but without looping over each value I was confused about how anything happens: "What should replace comparisons with False in python?"
- An example using `replace`, which does not have the multiple conditions and the pattern: "Replacing few values in a pandas dataframe column with another value"
An additional 'want': I want to update the existing dataframe with these values; I do not want to create a new dataframe.
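To make the in-place requirement concrete, here is a hedged sketch that fills missing zipcodes per city by assigning through df.loc on the existing dataframe. The generic_zip mapping, the 'FR_zipcode' placeholder, and the sample rows are hypothetical; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; 0 marks a missing zipcode as in the question.
df = pd.DataFrame({
    "cty_nm": ["San Jose", "Fresno", "San Jose", "Oakland"],
    "st_cd": ["CA", "CA", "CA", "CA"],
    "zip_cd": [0, 0, 95112, 0],
})

# Hypothetical mapping from lower-cased city name to a generic placeholder.
generic_zip = {"san jose": "SJ_zipcode", "fresno": "FR_zipcode"}

# Assumption: cast to object so the column can hold both ints and strings.
df["zip_cd"] = df["zip_cd"].astype(object)

# Look up a placeholder for every row; cities not in the mapping give NaN.
fallback = df["cty_nm"].str.lower().map(generic_zip)

# Fill only rows that are missing a zipcode, are in CA, and have a placeholder.
needs_fill = (df["zip_cd"] == 0) & (df["st_cd"] == "CA") & fallback.notna()

# Assigning through df.loc modifies df itself; no second dataframe is created.
df.loc[needs_fill, "zip_cd"] = fallback[needs_fill]
```

Rows whose city is not in the mapping (Oakland here) are left unchanged, which makes it easy to spot cities that still need a placeholder.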