Long time user and first time poster here.
I'm new to pandas and python. I'm working with a data frame that contains either a single value, or a string of values separated by a semicolon; the second data frame is a list of subcontractors and the company that oversees the subcontractor. I'm trying to figure out how to merge these data frames based on the two columns that could contain the subcontractors name. In order to cut down on a lot of the steps I was trying with the pandas library, I've been trying pandasql.
I've tried
on_resource_name = '''
SELECT
w.*, v.[Company Name], v.[Oversite Company] as [Survey Submitter Oversite Co]
FROM
wo_sat w
LEFT JOIN vendors v ON
v.[Company Name] LIKE '%'||w.[Embedded Data - Resource Name]||'%' OR
v.[Company Name] LIKE '%'||w.[Embedded Data - Source Company]||'%' OR
v.[Company Name] LIKE w.[Embedded Data - Resource Name]||'%' OR
v.[Company Name] LIKE w.[Embedded Data - Source Company]||'%'
;
'''
For some reason, the wild cards seem to be ignored and only exact matches are being returned. For example, if a cell contains only (SODEXO)
, Sodexo is returned, but on a cell that contains, (SODEXO; Jerry Thompson; Bill Murray)
no value is matched.
Any help would be greatly appreciated!