I am trying to perform a left merge in pandas (Python) where the join condition is a regular-expression match rather than key equality, and it needs to handle many-to-many relationships. Example:
import pandas as pd
df1 = pd.DataFrame(['a', 'b', 'c', 'd'], columns=['col1'])
df1['regex'] = '.*' + df1['col1'] + '.*'
  col1  regex
0    a  .*a.*
1    b  .*b.*
2    c  .*c.*
3    d  .*d.*
df2 = pd.DataFrame(['ab', 'a', 'cd'], columns=['col2'])
  col2
0   ab
1    a
2   cd
# Desired result: df1 left-merged to df2 wherever df1['regex'] matches df2['col2']
out = pd.DataFrame([['a', 'ab'], ['a', 'a'], ['b', 'ab'], ['c', 'cd'], ['d', 'cd']],
                   columns=['col1', 'col2'])
  col1 col2
0    a   ab
1    a    a
2    b   ab
3    c   cd
4    d   cd
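
For reference, here is a minimal sketch of one way the desired output could be produced. It is not necessarily the best approach. It cross-joins the two frames, keeps the pairs where the pattern matches, and then left-merges back so that unmatched rows of df1 are kept with NaN. The function name `regex_left_merge` and the helper column `_row` are hypothetical names chosen for illustration, `how='cross'` assumes pandas 1.2 or later, and `re.fullmatch` is an assumption that fits patterns already wrapped in `.*`.

```python
import re
import pandas as pd

df1 = pd.DataFrame(['a', 'b', 'c', 'd'], columns=['col1'])
df1['regex'] = '.*' + df1['col1'] + '.*'
df2 = pd.DataFrame(['ab', 'a', 'cd'], columns=['col2'])


def regex_left_merge(left, right, regex_col, target_col):
    """Hypothetical helper: left-merge `left` to `right` on regex matches."""
    # Tag each left row with a temporary key (assumed not to clash with
    # an existing column name).
    keyed = left.assign(_row=range(len(left)))
    # Pair every left row with every right row (requires pandas >= 1.2).
    pairs = keyed.merge(right, how='cross')
    # Keep pairs where the left pattern matches the whole right value.
    mask = [
        re.fullmatch(pattern, value) is not None
        for pattern, value in zip(pairs[regex_col], pairs[target_col])
    ]
    matched = pairs.loc[mask, ['_row', target_col]]
    # Left-merge back so left rows without any match survive with NaN.
    return keyed.merge(matched, on='_row', how='left').drop(columns='_row')


result = regex_left_merge(df1, df2, 'regex', 'col2')
print(result[['col1', 'col2']])
```

Note that the cross join materialises len(df1) * len(df2) rows, so this sketch is only practical when both frames are reasonably small.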