0

I have 2 DataFrames as shown below:

df1:

OSIED    geometry
257005   POLYGON ((311852.712 178933.993, 312106.023 17...
017049   POLYGON ((272943.107 137755.159, 272647.627 13...
017032   POLYGON ((276637.425 146141.397, 276601.509 14.

df2:

small_area Median_BER
2570059001   212.9
017049002    212.9
217112003    212.9

I need to search for df1.col1 in df2.col2 using "contains" logic and if it matches, get all the columns from both dataframes:

 osied   geometry              small_area   ber
 257005  POLYGON ((311852.71   2570059001   212.9

I am new to python, which function which does this? isin function isn't useful here.

SherylHohman
  • 16,580
  • 17
  • 88
  • 94
Tina
  • 61
  • 3
  • 1
    What do you mean by *"df1.col1 in df2.col2 using "contains" logic"*? How is 123 contained in 124? I don't see how it is "contained", whether we use string, integer or list operations – smci Jan 16 '20 at 21:35
  • Do you need to return all columns only if all columns match, or only return those columns where there is a match across the dataframes? – whege Jan 16 '20 at 21:38
  • What the comments above says, and where did `345` go? – Erfan Jan 16 '20 at 21:38
  • Given your expected result, this appears to be just a concatenation: `pd.concat([df1, df2], axis=1)`. Then rename the columns in the result. – Alexander Jan 16 '20 at 21:42
  • I want to search col1 values from dataframe 1 in col1 values in dataframe 2. They wont match entirey hence I would like to use contains/like function in python. If it matches all values from both frames should be displayed – Tina Jan 16 '20 at 21:43
  • Can you create a most exhaustive test case? – Scott Boston Jan 16 '20 at 21:44
  • Duplicate quaint did us for using SQL like commands, but I believe OP is looking for panda like commands. Also in comments for proposed answer, is a reason the proposed duplicate won't work. – SherylHohman Jan 16 '20 at 23:43

1 Answers1

0

Updated: Try this:

if any(df1.col1.isin(df2.col1)):
     pd.concat([df1, df2], axis=1)

I think what you are probably looking for is some kind of merge. You can do:

df2.merge(df1, left_on='col2', right_on='col1', how='inner')

or change the 'how' argument based on what you're looking for.

whege
  • 1,391
  • 1
  • 5
  • 13
  • merge wont work because not all values match from df1.col1 with df2.col1..meaning to say there is no primary key – Tina Jan 16 '20 at 21:41
  • So are you looking for if any value in the specified columns match, and if so then return all columns from both dataframes? – whege Jan 16 '20 at 21:43
  • Yes.....in sql we could use like operator...how would be go about that in python – Tina Jan 16 '20 at 21:44
  • @Tina see above. Basically this checks to see if there are any matches in the two specified columns, and if so then return a horizontal stack of the two dataframes. – whege Jan 16 '20 at 21:49
  • it doenst work..not sure for what reason it doesnt get into the ifloop – Tina Jan 16 '20 at 21:58
  • I have modified the data to reflect the original dataset.. ifany doesnt work in thi scase – Tina Jan 16 '20 at 22:02
  • 1
    Ok, the updated post shows why that wouldn't work. I'm trying to think what would work beyond a nested loop, but I'm not aware of any built-in functions that would do what you want. – whege Jan 16 '20 at 22:15
  • 1
    Maybe try looking here: https://stackoverflow.com/q/41590139/12229158 – whege Jan 16 '20 at 22:17