I have 2 dataframes, df1 and df2.
Suppose there is a location column in df1 which may contain a regular URL or a URL with a wildcard, e.g.:
- stackoverflow.com/questions/*
- *.cnn.com
- cnn.com/*/politics
The second dataframe, df2, has a url field which contains only valid URLs without wildcards.
I need to join these two dataframes, something like df1.join(df2, $"location" matches $"url"), if there were a magic matches operator for join conditions.
After some googling I still don't see a way to achieve this. How would you approach solving such a problem?
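
To make the intended semantics concrete, here is a rough sketch of what such a matches join could mean. It is not a known built-in. It assumes `*` stands for any sequence of characters and that everything else in location is literal. The names WildcardJoin, wildcardToRegex, joinOnWildcard and the helper column location_regex are hypothetical. It also assumes df2 has no column called location_regex. It relies on the SQL RLIKE operator accepting a column as the pattern, and on a non-equi join, which Spark will likely execute as a nested-loop or cartesian join, so it may not scale to large inputs.

```scala
import java.util.regex.Pattern
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{expr, udf}

object WildcardJoin {
  // Convert a glob-like pattern to an anchored Java regex:
  // literal chunks are quoted, and each `*` becomes `.*`.
  // The -1 limit keeps leading/trailing empty chunks, so "*.cnn.com" and "a/*" work.
  def wildcardToRegex(pattern: String): String =
    pattern.split("\\*", -1).map(s => Pattern.quote(s)).mkString("^", ".*", "$")

  // Hypothetical helper: join df1 (column "location") with df2 (column "url").
  def joinOnWildcard(df1: DataFrame, df2: DataFrame): DataFrame = {
    // Null locations stay null, so they never match any url.
    val toRegex = udf((p: String) => if (p == null) null else wildcardToRegex(p))
    val withRegex = df1.withColumn("location_regex", toRegex(df1("location")))
    // Non-equi join condition: the regex comes from a column, not a literal.
    withRegex.join(df2, expr("url rlike location_regex"))
  }
}
```

Usage would be something like `WildcardJoin.joinOnWildcard(df1, df2).drop("location_regex")`. Precomputing the regex as a column keeps the join condition a plain SQL expression. The quadratic comparison cost is the part that would still need a better idea.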