I have a list like this:

list1 = ['4361', '1856', '57586', '79017', '972', '974', '1829', '10787', '85477', '57019', '7431', '53616', '26228', '29085', '5217', '5527']

And then I have two columns of a data frame like this:

print(df['col A'][0:10])
0      6416
1     84665
2        90
3      2624
4      6118
5       375
6       377
7       377
9       351
10      333


print(df['col B'][0:10])
0      2318
1        88
2      2339
3      5371
4      6774
5     23163
6     23647
7     27236
9     10513
10     1600

I want to return only the rows of the data frame where the value in either col A or col B is an item in the list.

I could imagine how to do this iteratively, something like this:

for each_item in list1:
    for i, row in df.iterrows():
        # list1 holds strings, so compare against the string form of each cell
        if each_item == str(row['col A']):
            print(row)
        if each_item == str(row['col B']):
            print(row)

I'm just wondering if there's a neater way to do it where I don't have to continually loop through the dataframe, as both the list and the dataframe are quite big.

I saw this code snippet online, where this would return the rows where df['col A'] equals a value OR df['col B'] equals a value:

print(df[(df["col A"] == 1) | (df["col B"] == 2)])

I'm just unsure how to adapt this to select rows whose values appear in a list. Can someone show me how to incorporate this kind of idea into my code, or do people think my original code snippet (using .iterrows()) is the best way?

Slowat_Kela
    Does this answer your question: https://stackoverflow.com/questions/18250298/how-to-check-if-a-value-is-in-the-list-in-selection-from-pandas-data-frame? – Dani Mesejo Oct 22 '21 at 09:29

1 Answer

Use isin:

print(df[(df['col A'].isin(list1)) | (df['col B'].isin(list1))])
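
One caveat worth checking: isin compares values exactly, and list1 in the question holds strings while the columns look numeric, so the filter may come back empty unless the types agree. A minimal sketch, assuming the columns are integers; the sample data and the names wanted, mask, result and result_any are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data, loosely modelled on the question
list1 = ['4361', '1856', '972', '5527']
df = pd.DataFrame({
    'col A': [6416, 972, 90, 2624],
    'col B': [2318, 88, 4361, 5371],
})

# Assumption: the columns hold integers, so convert the string items to match
wanted = [int(x) for x in list1]

# Keep a row if either column holds one of the wanted values
mask = df['col A'].isin(wanted) | df['col B'].isin(wanted)
result = df[mask]

# Equivalent form that is easier to extend to more columns
result_any = df[df[['col A', 'col B']].isin(wanted).any(axis=1)]

print(result)
```

If the columns are actually stored as strings, the conversion step can be dropped and list1 passed to isin directly.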
Muhammad Hassan