I have a pandas DataFrame like this:

          state    x    y
0          Alabama  139  -77
1           Alaska -204 -170
2          Arizona -203  -40
etc.

I have a list of state names that I need to remove from this DataFrame. I tried to do it with a loop, but it does not work: the DataFrame is the same as before. The state names are spelled as they should be, so my condition should find them.

for state in self.guessed_states:
    states_data.drop(states_data[states_data["state"] == state].index, inplace=True)

The output is the same, so nothing changes after this loop. What am I doing wrong here?

vdmclcv
  • `states_data = states_data.query(expr="state not in @self.guessed_states")` – Jason Baker Sep 02 '23 at 16:41
  • Try: `states_data = states_data[~states_data["state"].isin(self.guessed_states)]` – not_speshal Sep 02 '23 at 17:19
  • Several variable names are left for us to guess. You could bring your question closer to an actual [minimal reproducible example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – OCa Sep 02 '23 at 21:08
  • Does this answer your question? [How to drop a list of rows from Pandas dataframe?](https://stackoverflow.com/questions/14661701/how-to-drop-a-list-of-rows-from-pandas-dataframe) – OCa Sep 02 '23 at 21:36

2 Answers

0

You can either filter the DataFrame with a boolean mask or use drop():


import pandas as pd

del_ = ['Drop1','Drop2']

data = {'state':['Alabama','Alaska','Drop2','Arizona','Drop1'], 'x':[139,-204,-100,-203,'--100']}

df = pd.DataFrame(data)
print(df)

# Keep only the rows whose state is in del_
df3 = df[df['state'].isin(del_)]
print(df3)

# Keep only the rows whose state is not in del_
df4 = df[~df['state'].isin(del_)]
print(df4)


# Drop rows one at a time by cell value
for i, row in df.iterrows():
    if row['state'] in del_:
        df = df.drop(i)
print(df)

Output:

     state      x
0  Alabama    139
1   Alaska   -204
2    Drop2   -100
3  Arizona   -203
4    Drop1  --100

   state      x
2  Drop2   -100
4  Drop1  --100

     state     x
0  Alabama   139
1   Alaska  -204
3  Arizona  -203

     state     x
0  Alabama   139
1   Alaska  -204
3  Arizona  -203

Other drop solutions are:

# Using drop(): remove the matching rows from the existing df in place
df.drop(df.index[df['state'] == 'Drop1'], inplace=True)

# Drop the matching rows into a new df and print it
df2 = df.drop(df[df['state'] == 'Drop2'].index, axis=0)
print(df2)
Hermann12
  • This is a curious way to use drop. In a loop when the function readily accepts multiple rows, and... do I see deleting elements from the object you are iterating on? – OCa Sep 02 '23 at 21:23
  • @OCa I'm not sure I understood your request correctly. I edited my first solution and added further functions. Hope this helps. – Hermann12 Sep 02 '23 at 22:46
-1

To begin with, here is what you might be doing wrong:

Your loop works as expected on my side, so my guess is that self.guessed_states is either an empty list or does not contain the state names exactly as they appear in the input dataframe. The names could differ in case (a/A) or carry trailing spaces.
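One way to check that guess is sketched below. The sample data, the trailing space, the lowercase name, and the variable names `unmatched`, `normalized`, `wanted`, and `remaining` are all made up for illustration; they are not taken from the question.

```python
import pandas as pd

# Hypothetical data: "Alaska " has a trailing space on purpose.
states_data = pd.DataFrame({
    "state": ["Alabama", "Alaska ", "Arizona"],
    "x": [139, -204, -203],
    "y": [-77, -170, -40],
})
# Hypothetical list: "alabama" differs in case from the column value.
guessed_states = ["alabama", "Alaska"]

# Names from the list that have no exact match in the column.
known = set(states_data["state"])
unmatched = [s for s in guessed_states if s not in known]
print(unmatched)

# Normalize both sides (strip whitespace, fold case) before comparing.
normalized = states_data["state"].str.strip().str.casefold()
wanted = {s.strip().casefold() for s in guessed_states}
remaining = states_data[~normalized.isin(wanted)]
print(remaining)
```

If `unmatched` is not empty, the exact comparison `states_data["state"] == state` never matches those names, which would explain a loop that appears to do nothing.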

That said, your code could be much simpler, because there is no need for a for loop here.

1. Using drop

If you would like to use index-based tools such as drop, you can set the state column as the index:

df = states_data.set_index('state')
           x    y
state            
Alabama  139  -77
Alaska  -204 -170
Arizona -203  -40

Then drop is straightforward, since it readily accepts lists:

drop_list = ['Alabama','Arizona']

df.drop(drop_list, axis=0, inplace=True)
          x    y
state           
Alaska -204 -170
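One caveat, sketched below with made-up data: by default drop raises a KeyError when a label in the list is not present in the index, and passing errors="ignore" skips the missing labels instead. The name "Atlantis" and the variables `raised` and `result` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "state": ["Alabama", "Alaska", "Arizona"],
    "x": [139, -204, -203],
    "y": [-77, -170, -40],
}).set_index("state")

# "Atlantis" is a hypothetical label that is not in the index.
drop_list = ["Alabama", "Arizona", "Atlantis"]

# Default behaviour: a missing label raises KeyError.
try:
    df.drop(drop_list, axis=0)
    raised = False
except KeyError:
    raised = True
print(raised)

# errors="ignore" drops the labels that exist and skips the rest.
result = df.drop(drop_list, axis=0, errors="ignore")
print(result)
```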

You can move the states back into a regular column at any time:

df.reset_index(drop=False)
    state    x    y
0  Alaska -204 -170

2. Using isin

Alternatively, if you wish to keep the dataframe structure unchanged throughout, you can use the one-liner below, as not_speshal also suggested. It returns the same rows:

states_data[~states_data['state'].isin(drop_list)]
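Note that this expression returns a new dataframe and leaves states_data untouched, so the result has to be assigned to keep it. A minimal sketch with the sample data from the question; the reset_index call is optional and only renumbers the rows:

```python
import pandas as pd

states_data = pd.DataFrame({
    "state": ["Alabama", "Alaska", "Arizona"],
    "x": [139, -204, -203],
    "y": [-77, -170, -40],
})
drop_list = ["Alabama", "Arizona"]

# Assign the filtered result back, then renumber the rows from 0.
mask = states_data["state"].isin(drop_list)
states_data = states_data[~mask].reset_index(drop=True)
print(states_data)
```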
OCa