Compiler issue: AssertionError on replace( ) given T or F condition with string and column cell

Question

I want to check whether a column entry matches a city in a list of cities (region). If there is a match, I want to write a string with the region's zip code (region_name) to another column; if there is no match, I want to keep that column's current value.

A review of cases

How to resolve Assertion Error for multiple columns in pandas
AssertionError with pandas when reading excel
The documentation at https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.replace.html and https://www.w3resource.com/pandas/series/series-replace.php says: "Raises AssertionError if regex is not a bool and to_replace is not None." I'm not 100% clear on what that means.

I tried a new library (modin) and made a few changes (including installing pylint when prompted by a popup). Afterward, replace() no longer worked with a column.

import re
import pandas as pd

df = pd.DataFrame({'city_nm': ['Cupertino', 'Mountain View', 'Palo Alto'], 'zip_cd': ['95014', False, '94306']})
region_name = '99999'
region = ['Cupertino', 'Mountain View', 'Palo Alto']

def InferZipcodeFromCityName(df, region, region_name):
    # regex alternation that matches any city in the region
    PATTERN_CITY = '|'.join(region)
    # True where the zip code is missing (False) and the city is in the region
    foundZipbyCity = (
        (df['zip_cd'] == False) &
        (df['city_nm'].str.contains(PATTERN_CITY, flags=re.IGNORECASE))
    )
    # this is the line that raises the AssertionError
    df['zip_cd'] = foundZipbyCity.replace((True, False), (region_name, df['zip_cd']))
    return df

#this is what I want
In[1]: df = InferZipcodeFromCityName(df, region, region_name)
Out[1]:
           city_nm   zip_cd
0      'Cupertino'  '95014'
1  'Mountain View'  '99999'
2      'Palo Alto'  '94306'

#this is what I get --> AssertionError

try 1: df['zip_cd'] = foundZipbyCity.replace((True, False), (region_name, df['zip_cd']), regex=False)  # AssertionError
try 2: df['zip_cd'] = foundZipbyCity.replace((True, False), (region_name, region_name))  # changed to (string, string); this runs, but it does nothing useful

EDIT: On a second and a third laptop, I installed Anaconda and VS Code and the code works fine. On the first laptop, I uninstalled Anaconda and VS Code and reinstalled them, with no effect. (This laptop ran this code fine for a year, up until I tried the modin library. Probably a coincidence, but still.)

Try setting `regex=False` explicitly in replace, e.g. `df['zip_cd'] = foundZipbyCity.replace((True, False), (region_name, df['zip_cd']), regex=False)` – forgetso, Nov 15 '20 at 10:16
Do you have a bigger stack trace than this? It might give details of the specific pandas internal function that went wrong. – forgetso, Nov 15 '20 at 10:25
The fact that you can't even see pandas in the stack trace is suspicious. Maybe pylint is somehow killing the process before pandas is even run. – forgetso, Nov 15 '20 at 10:51

score 1 · Accepted Answer · answered Nov 16 '20 at 10:45

The problem is that you expect this statement to take the value from df["zip_cd"] for every row where the mask is False:

df['zip_cd'] = foundZipbyCity.replace( (True, False), (region_name, df['zip_cd']) )

That is not what happens, however. replace() treats each pair as value -> replacement, so it tries to replace the scalar False with the whole Series df["zip_cd"], and pandas appears to fail when asked to replace a scalar with a Series.
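The element-wise behaviour the question expected ("use region_name where the mask is True, otherwise keep the existing value") is what Series.mask provides. A minimal sketch follows; the literal `found` mask and the names `zip_cd` and `filled` are stand-ins invented for illustration, not part of the original code:

```python
import pandas as pd

# Stand-in data: the zip_cd column from the question and a hand-written mask
# marking the single row that should receive the region zip code.
zip_cd = pd.Series(['95014', False, '94306'])
found = pd.Series([False, True, False])

# mask() replaces values where the condition is True and keeps the rest,
# row by row, which is the pairing the replace() call was meant to express.
filled = zip_cd.mask(found, '99999')
print(filled.tolist())
```

Series.where is the mirror image: it keeps values where the condition is True and replaces the others.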

What you probably want instead is to set the values of df["zip_cd"] selected by the foundZipbyCity mask to region_name:

df["zip_cd"][foundZipbyCity] = region_name

I ran your code with this change and it produced the expected result.
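Chained indexing such as df["zip_cd"][foundZipbyCity] = ... can trigger pandas' SettingWithCopyWarning. A possible variant that does the same assignment in a single .loc call is sketched below. The snake_case function name, the re.escape call, and the df.copy() are additions made for this sketch and are not part of the original question or answer:

```python
import re

import pandas as pd


def infer_zipcode_from_city_name(df, region, region_name):
    """Return a copy of df with missing zip codes filled for cities in region."""
    # Escape each city so regex metacharacters in a name are matched literally
    # (an assumption of this sketch; the original joined the names unescaped).
    pattern_city = '|'.join(re.escape(city) for city in region)
    # '== False' is intentional: it is an element-wise comparison on the column.
    found = (df['zip_cd'] == False) & df['city_nm'].str.contains(
        pattern_city, flags=re.IGNORECASE
    )
    out = df.copy()  # leave the caller's frame untouched
    # One .loc call selects rows and column together, avoiding chained indexing.
    out.loc[found, 'zip_cd'] = region_name
    return out


df = pd.DataFrame({
    'city_nm': ['Cupertino', 'Mountain View', 'Palo Alto'],
    'zip_cd': ['95014', False, '94306'],
})
region = ['Cupertino', 'Mountain View', 'Palo Alto']
result = infer_zipcode_from_city_name(df, region, '99999')
print(result)
```

Only the 'Mountain View' row has zip_cd equal to False, so it is the only row that changes; the other two keep their existing zip codes.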

answered Nov 16 '20 at 10:45

Dmitry Chigarev


Thank you. I'm not sure what the runtime of this line of code was; however, I used it a dozen times, and your version should be 2x faster. I still wonder why my code worked on my 'left side' MS Surface but not on my 'right side' MS Surface: luke the spook at work – forest.peterson Nov 17 '20 at 05:31
This warning is raised each time: "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy" – forest.peterson Nov 17 '20 at 06:17
