
I am trying to compare a Pandas DataFrame with a list. I have extracted the IDs into a list called list_x.

Since several rows share the same ID, the list contains duplicates, e.g. list_x = [1,1,1,1,2,3, etc.]

I am trying to drop all dataframe rows whose ID is also in the list.

What I have been trying are variations of:

for j in range(len(dataframe)-1):
    if dataframe.loc(j,"ID") in list_x: dataframe.drop([j], inplace = True)

or variations of:

for j in range(len(dataframe)-1):
    for k in range(len(list_x)-1):
        if dataframe.loc(j,"ID") in list_x[k]: dataframe.drop([j], inplace = True)

I get an error, which I think comes from comparing the list's index with the dataframe rather than the actual list entry.

Any help would be appreciated.

  • Does this answer your question? [How to filter Pandas dataframe using 'in' and 'not in' like in SQL](https://stackoverflow.com/questions/19960077/how-to-filter-pandas-dataframe-using-in-and-not-in-like-in-sql) – David Buck Nov 06 '22 at 11:44

1 Answer


You want the dataframe without the rows whose ID appears in list_x. You can build a boolean mask with isin and negate it with ~:

import pandas as pd

# your df (2 columns: ID and value)
df = pd.DataFrame({'ID': [1,3,5,6,7], 'value': ['red', 'blue', 'green', 'orange', 'purple']})

# the list of IDs you don't want in the dataframe
list_x = [1,1,2,3,5]

# the output: keep only the rows whose ID is not in list_x
df = df[~df.ID.isin(list_x)]
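
As for why the loops in the question fail: .loc is an indexer, so it takes square brackets (dataframe.loc[j, "ID"]) rather than a call with parentheses, and in the nested version list_x[k] is a single integer, so testing "in" against it raises a TypeError. Also, range(len(dataframe)-1) stops one row short. Here is a minimal sketch that puts the mask approach next to a corrected loop so you can check they agree. The sample data and the names mask, filtered, kept_labels and looped are only illustrative, not from your code:

```python
import pandas as pd

# Illustrative sample data (same shape as the example above)
df = pd.DataFrame({"ID": [1, 3, 5, 6, 7],
                   "value": ["red", "blue", "green", "orange", "purple"]})
list_x = [1, 1, 2, 3, 5]  # duplicates in the list do not matter for isin

# Boolean mask: True where the row's ID appears in list_x
mask = df["ID"].isin(list_x)

# Keep only the rows whose ID is NOT in the list
filtered = df[~mask].reset_index(drop=True)

# Loop equivalent, for comparison only: .loc uses [] and we iterate
# over the index labels instead of range(len(df) - 1)
kept_labels = [j for j in df.index if df.loc[j, "ID"] not in list_x]
looped = df.loc[kept_labels].reset_index(drop=True)

print(filtered)
```

With this sample data only the rows with ID 6 and 7 remain. The mask version is usually preferable because it avoids dropping rows while iterating over them.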
koding_buse