I am trying to select all rows in our database that have been mangled and contain non-UTF-8 characters. Is a regex the best way to do this?
Currently, I have tried "like '%Ã%'", which works fairly well but is a long way from catching everything. I also tried "REGEXP '(\S+[^A-Za-z0-9]+)'", which isn't great: it also pulls back all of the characters we have successfully translated back into UTF-8, as well as spaces etc. Although the latter are easy enough to filter out, I am not sure a regex is the best route.
Examples of rows not being selected include values such as "dié", "yücel" and "Gråberg".
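
To make clear what I mean by "mangled", here is a minimal sketch in Python (outside the database) of the check I am effectively after. It assumes the damage came from UTF-8 bytes being read as cp1252, which is my guess and not something I have confirmed. The name looks_mangled is made up for illustration.

```python
def looks_mangled(text: str) -> bool:
    """Return True if text looks like UTF-8 bytes that were decoded as cp1252.

    Assumption: the mangling was a UTF-8 -> cp1252 misread. Other kinds of
    damage (or bytes cp1252 cannot represent) are not detected by this check.
    """
    try:
        # Undo the suspected misread: back to raw bytes, then decode properly.
        repaired = text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The round trip failed, so the text is not this kind of mojibake.
        return False
    # Plain ASCII survives the round trip unchanged, so it is not flagged.
    return repaired != text


if __name__ == "__main__":
    for good in ["dié", "yücel", "Gråberg"]:
        # Simulate the suspected damage on a known-good value.
        bad = good.encode("utf-8").decode("cp1252")
        print(repr(good), looks_mangled(good), looks_mangled(bad))
```

The idea is that a correctly stored "dié" fails the round trip and is left alone, while its damaged form round-trips to something different and gets flagged. I would like the equivalent of this as a query.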
Thanks