I have the following dataset containing film titles and the corresponding genre, while another dataset contains plain text where these titles might be quoted or not:
dt1
title genre
Secret in Their Eyes Dramas
V for Vendetta Action & Adventure
Bottersnikes & Gumbles Kids' TV
... ...
and
dt2
id Text
1. "I really liked V for Vendetta"
2 "Bottersnikes & Gumbles was a great film .... "
3. " In any case, in my opinion bottersnikes &gumbles was a great film ..."
4 "@thewitcher was an interesting series
5 "Secret in Their Eye is a terrible film! but I Like V per Vendetta"
... etc
what I want to obtain is a function that matched those titles in dt1 and tries to find them in the text in dt2:
if it finds any match or approximate match I want to have a column in dt2 that tells with the title that was mentioned in the text. if more than one is mentioned I want a any titles separated by a comma.
dt2
id Text mentions
1. "I really liked V for Vendetta" "V for Vendetta"
2 "Bottersnikes & Gumbles was a great film .... " "Bottersnikes & Gumbles"
3. " In any case, in my opinion bottersnikes &gumbles was a great film ..." "Bottersnikes & Gumbles"
4 "@thewitcher was an interesting series NA
5 "Secret in Their Eye is a terrible film! but I Like V per Vendetta" "Secret in Their Eyes, V for Vendetta"
... etc