if value in pandas column is not in

Question

I have a DataFrame named Total_unificado. I need to apply some conditions when the values in the "Casa" column are not in ('StockCenter', 'Dexter', 'Moov'). I'm trying this:

if Total_unificado[~Total_unificado['Casa'].isin(['StockCenter', 'Dexter', 'Moov'])]:

but I'm getting this error:

 File "c:\Users\User\Desktop\Personal\DABRA\Unificador_Salida_Final.py", line 81, in <module>
    if Total_unificado[~Total_unificado['Casa'].isin(['StockCenter', 'Dexter', 'Moov'])]:      
  File "C:\Users\User\Desktop\Personal\DABRA\Scraper_jfs\venv\lib\site-packages\pandas\core\generic.py", line 1527, in __nonzero__
    raise ValueError(
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What is wrong with the script?

Thanks in advance!

Akanksha Atrey · Answer 1 · 2022-05-10T22:10:38.430


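The output of Total_unificado[~Total_unificado['Casa'].isin(['StockCenter', 'Dexter', 'Moov'])] is the subset of the DataFrame where the values of "Casa" are not in the specified list. An if statement needs a single True/False value, and pandas refuses to reduce a whole DataFrame to one, hence the error. To test whether any such rows exist, check .empty on the subset or call .any() on the boolean mask.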
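As a minimal sketch of that check (the sample rows and the names `excluded`, `mask`, `has_other_rows` and `subset_is_empty` are made up for illustration; only `Total_unificado` and "Casa" come from the question), the mask can be reduced to one boolean before it reaches the if:

```python
import pandas as pd

# Hypothetical sample data standing in for the real Total_unificado
Total_unificado = pd.DataFrame({
    "Casa": ["StockCenter", "Dexter", "Otra", "Moov"],
})

excluded = ["StockCenter", "Dexter", "Moov"]

# Boolean Series: True for rows whose "Casa" is NOT in the list
mask = ~Total_unificado["Casa"].isin(excluded)

# `if Total_unificado[mask]:` raises ValueError, so reduce to one bool first
has_other_rows = bool(mask.any())
subset_is_empty = Total_unificado[mask].empty

if has_other_rows:
    print(Total_unificado[mask])
```

Either form works: mask.any() asks whether at least one row matches, while .empty asks the opposite question about the filtered DataFrame.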

Edit after comments, to iterate over the subset of the DataFrame:

df_subset = Total_unificado[~Total_unificado['Casa'].isin(['StockCenter', 'Dexter', 'Moov'])]

for index, row in df_subset.iterrows():
    ...  # apply conditions to each row here

edited May 10 '22 at 22:10

answered May 10 '22 at 20:22


Oh, thanks for the answer! And what should I do if I need to apply some conditions, for each row, only when the value in Total_unificado['Casa'] is not 'StockCenter', 'Dexter' or 'Moov'? – Maximiliano Vazquez May 10 '22 at 20:37
You would have to use Pandas iterrows() on the subset dataframe (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iterrows.html). However, iterating over pandas rows should be avoided if possible. What are you trying to do exactly to each row? – Akanksha Atrey May 10 '22 at 20:39
I need to apply about 15 conditions like these to the rows where Marca's value != ['StockCenter', 'Dexter', 'Moov']: `Total_unificado['Sub_Categoria'] = np.where(Total_unificado.Descripcion_Producto.str.contains(r'^(?=.*Remera)(?=.*Pack)', regex=True), 'Remeras', Total_unificado.Sub_Categoria)` `Total_unificado['Sub_Categoria'] = np.where(Total_unificado.Descripcion_Producto.str.startswith('Top'), 'Tops', Total_unificado.Sub_Categoria)` – Maximiliano Vazquez May 10 '22 at 21:45
Got it. I am sure you can do that without iterating over the rows as well. But to answer your initial question, you can iterate over df_subset (see edit) using `for index, row in df_subset.iterrows():`. Check this for further help: https://stackoverflow.com/questions/23330654/update-a-dataframe-in-pandas-while-iterating-row-by-row. – Akanksha Atrey May 10 '22 at 22:08
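One possible way to do this without iterating, sketched under assumptions: the sample rows below are invented, the filter is applied to the "Casa" column as in the question (the comment mentions Marca), and `.loc` with a combined mask is swapped in for the `np.where` calls from the comment. The names `not_excluded`, `is_remera_pack` and `is_top` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question and comments
Total_unificado = pd.DataFrame({
    "Casa": ["StockCenter", "Otra", "Otra", "Moov", "Otra"],
    "Descripcion_Producto": [
        "Remera Pack x2", "Remera Pack x3", "Top deportivo",
        "Top basico", "Zapatilla",
    ],
    "Sub_Categoria": ["A", "B", "C", "D", "E"],
})

# Rows whose "Casa" is NOT one of the excluded values
not_excluded = ~Total_unificado["Casa"].isin(["StockCenter", "Dexter", "Moov"])
desc = Total_unificado["Descripcion_Producto"]

# Condition 1: description contains both "Remera" and "Pack"
is_remera_pack = desc.str.contains(r"^(?=.*Remera)(?=.*Pack)", regex=True)
Total_unificado.loc[not_excluded & is_remera_pack, "Sub_Categoria"] = "Remeras"

# Condition 2: description starts with "Top"
is_top = desc.str.startswith("Top")
Total_unificado.loc[not_excluded & is_top, "Sub_Categoria"] = "Tops"

print(Total_unificado)
```

Each of the roughly 15 conditions would get its own mask combined with `not_excluded`, so rows from the excluded values keep their original Sub_Categoria. If Descripcion_Producto can contain missing values, passing na=False to the string methods is probably needed so the masks stay boolean.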
