I have two dataframes (df and dbdf), and I want to check whether any of the strings in df's first column are contained within dbdf's first column.

df
   Parameter          Result
0  Bifenthrin          0.076
1  Cyantraniliprole    0.500
2  DDT (Sum)           0.011
3  Flonicamid (SP)     0.145
4  Imidacloprid (SP)   0.022
5  Indoxacarb          1.450
6  Sulfoxaflor         0.079
7  Tebuconazole        0.057

dbdf[1400:1500].sample(5)

      Pesticide Name       ADI
357   1-Naphthylacetamide  0.1
738   Dichlorprop-P        0.06
965   Folpet               0.1
469   Benomyl              0.1
1227  Orthosulfamuron      0.05

for c in df["Parameter"]:
    dbdf[dbdf['Pesticide Name'].str.contains(c)]

I expect these last two lines to check whether each of the strings in df["Parameter"] occurs within any of the strings in the "Pesticide Name" column of dbdf, but they don't. Instead I get a warning:

"/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:3: UserWarning: This pattern has match groups. To actually get the groups, use str.extract. This is separate from the ipykernel package so we can avoid doing imports until"

Why am I getting this warning, and how can I better write my code?

Also, since some of the strings won't match exactly, should I be using the "in" operator, or something like get_close_matches()?

Thanks in advance.

Westworld
  • `dbdf[dbdf['Pesticide Name'].str.contains('|'.join(df.Parameter))]` without the loop. As for the "close" matches, that's a more difficult problem. – ALollz May 02 '19 at 21:03
  • I will try my best to understand what you've done there. Appreciate it – Westworld May 02 '19 at 21:08
  • Have you considered using a `merge` instead? – PMende May 02 '19 at 21:08
  • This post has a bit more information on the `'|'` in regex: https://stackoverflow.com/questions/26577516/how-to-test-if-a-string-contains-one-of-the-substrings-in-a-list. Also see https://stackoverflow.com/questions/48541444/pandas-filtering-for-multiple-substrings-in-series, and for fuzzy merging see https://stackoverflow.com/questions/13636848/is-it-possible-to-do-fuzzy-match-merge-with-python-pandas – ALollz May 02 '19 at 21:09
  • So merge would create a new dataframe, right? That's a good idea! – Westworld May 02 '19 at 21:09
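
A minimal sketch of the loop-free approach suggested in the comments, offered as one possible way to do it rather than a definitive answer. The warning most likely comes from the parentheses in values such as "DDT (Sum)": `str.contains` treats its argument as a regular expression by default, so `(Sum)` is read as a match group. Escaping each value with `re.escape` before joining with `'|'` avoids that. The two small frames below are invented stand-ins for df and dbdf (their values are not the real data), and the `cutoff=0.6` passed to `difflib.get_close_matches` is an arbitrary assumption to be tuned.

```python
import re
import difflib

import pandas as pd

# Hypothetical stand-ins for the frames in the question; values are invented.
df = pd.DataFrame({
    "Parameter": ["Bifenthrin", "DDT (Sum)", "Flonicamid (SP)", "Folpet"],
    "Result": [0.076, 0.011, 0.145, 0.100],
})
dbdf = pd.DataFrame({
    "Pesticide Name": ["Bifenthrin", "Folpet", "Benomyl", "Flonicamid", "DDT (Sum)"],
    "ADI": [0.01, 0.1, 0.1, 0.025, 0.01],
})

# re.escape turns "(" and ")" into literals, so the joined pattern
# contains no match groups and pandas has nothing to warn about.
pattern = "|".join(re.escape(p) for p in df["Parameter"])

# One vectorised call instead of a Python loop; na=False treats missing names as non-matches.
matches = dbdf[dbdf["Pesticide Name"].str.contains(pattern, na=False)]
print(matches)

# For names that do not match exactly, pick the closest database name (if any).
names = dbdf["Pesticide Name"].tolist()
closest = {}
for p in df["Parameter"]:
    hit = difflib.get_close_matches(p, names, n=1, cutoff=0.6)
    closest[p] = hit[0] if hit else None
print(closest)
```

Note that the substring check only finds database names that contain a parameter string in full, so "Flonicamid (SP)" does not match "Flonicamid" there; the `difflib` step is what pairs those two. If the values are meant to be plain text rather than patterns, passing `regex=False` to `str.contains` for a single string is another option.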
