I have two DataFrames that I want to join on a string column, but one of them encodes its names with very shoddy Unicode support that drops accents and other diacritics:
fips = DataFrame(muni=["Adjuntas", "Anasco", "Bayamon", "Mayaguez"], fips=[72001, 72011, 72021, 72097])
pops = DataFrame(muni=["Adjuntas", "Añasco", "Bayamón", "Mayagüez"], pop=[17363, 26161, 169269, 71530])
I want to have leftjoin(pops, fips; on=:muni) use an approximate equality when joining that handles missing accents and diacritics (but ensures the base character is the same), or, failing that, some sort of ascii-ification string transform on pops.
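For reference, here is a minimal sketch of the fallback I have in mind (a string transform, not a true approximate-equality join). It assumes that `Unicode.normalize` from the standard library with `stripmark=true` is an acceptable way to drop diacritics; the helper name `strip_marks` and the column name `muni_key` are made up for illustration.

```julia
using DataFrames
using Unicode  # standard library

# Hypothetical helper: remove combining marks, keeping the base character,
# so that "Añasco" becomes "Anasco".
strip_marks(s::AbstractString) = Unicode.normalize(s; stripmark=true)

fips = DataFrame(muni=["Adjuntas", "Anasco", "Bayamon", "Mayaguez"], fips=[72001, 72011, 72021, 72097])
pops = DataFrame(muni=["Adjuntas", "Añasco", "Bayamón", "Mayagüez"], pop=[17363, 26161, 169269, 71530])

# Add an accent-free key column (the name `muni_key` is an assumption),
# then join that key against the accent-less names in `fips`.
pops_keyed = transform(pops, :muni => ByRow(strip_marks) => :muni_key)
joined = leftjoin(pops_keyed, fips; on = :muni_key => :muni)
```

This keeps the accented spelling in `joined.muni` and only uses the stripped version as the join key. As far as I can tell it only covers characters that decompose into a base letter plus a combining mark, so letters such as "ø" would presumably pass through unchanged.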