You can choose particular Unicode blocks explicitly by providing a range of characters:
[-a-zA-ZÀ-ÿĀ-ſƀ-ɏ ']+
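It covers the space, hyphen (-), and apostrophe (') characters, which can be part of a name in different cultures.
Plus these Unicode blocks: the letters of Latin-1 Supplement (À-ÿ, a range that also includes the × and ÷ signs), Latin Extended-A (Ā-ſ), and Latin Extended-B (ƀ-ɏ).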
If it's not required, you can remove the last range, ƀ-ɏ (Latin Extended-B). You can also add Latin Extended-C, Latin Extended-D, and Latin Extended-E.
For the full list of blocks, see "Latin script in Unicode".
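
To make adding or removing blocks less error-prone, the character class can be assembled from code point ranges. This is only a sketch: the names BLOCKS and build_name_pattern are made up for illustration, and the ranges are the standard Unicode block boundaries (with Latin-1 Supplement cut down to À-ÿ, as in the regex above).

```python
import re

# Code point ranges per block. The first three mirror the regex above;
# the Extended-C/D/E entries are optional additions.
BLOCKS = {
    "latin1_letters": (0x00C0, 0x00FF),  # À-ÿ, part of Latin-1 Supplement
    "latin_ext_a": (0x0100, 0x017F),     # Ā-ſ
    "latin_ext_b": (0x0180, 0x024F),     # ƀ-ɏ
    "latin_ext_c": (0x2C60, 0x2C7F),
    "latin_ext_d": (0xA720, 0xA7FF),
    "latin_ext_e": (0xAB30, 0xAB6F),
}


def build_name_pattern(block_names):
    """Build a name regex from ASCII letters, space, - and ' plus the given blocks."""
    ranges = "".join(
        f"{chr(lo)}-{chr(hi)}" for lo, hi in (BLOCKS[name] for name in block_names)
    )
    # The leading "-" is a literal hyphen because it is first in the class.
    return re.compile(f"[-a-zA-Z{ranges} ']+")


default_pattern = build_name_pattern(["latin1_letters", "latin_ext_a", "latin_ext_b"])
without_ext_b = build_name_pattern(["latin1_letters", "latin_ext_a"])
```

With the first three blocks this produces the same pattern as the hand-written one, and dropping "latin_ext_b" rejects names such as Ștefan, whose first letter lives in Latin Extended-B.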
If you need to, you can also add Cyrillic.
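
One possible way to do that, assuming you want the whole Cyrillic block (U+0400 to U+04FF) rather than just the basic Russian alphabet, is to append that range to the class. The name NAME_WITH_CYRILLIC is hypothetical, and note that this block also contains a few non-letter signs (for example the combining marks at U+0483 to U+0489), so narrow the range if that matters to you.

```python
import re

# Assumption: the whole Cyrillic block U+0400-U+04FF is added to the
# original character class; adjust the range to your needs.
NAME_WITH_CYRILLIC = re.compile("[-a-zA-ZÀ-ÿĀ-ſƀ-ɏ\u0400-\u04FF ']+")

# fullmatch() requires the whole string to consist of allowed characters.
cyrillic_ok = NAME_WITH_CYRILLIC.fullmatch("Анна-Мария Ёлкина") is not None
digits_ok = NAME_WITH_CYRILLIC.fullmatch("Анна1") is not None
```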
Don't forget to test your regex. Even better, write automated tests against a list of names.
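
A minimal sketch of such a test in Python could look like this. The VALID and INVALID lists are made-up sample data, and is_valid_name and check_names are hypothetical helper names; in practice you would load a real name list and run this under your test framework of choice.

```python
import re

NAME_RE = re.compile("[-a-zA-ZÀ-ÿĀ-ſƀ-ɏ ']+")

# Made-up sample data; replace with a real list of names.
VALID = ["Anne-Marie", "O'Connor", "José Álvarez", "Łukasz Żółć", "Ștefan"]
INVALID = ["", "John3", "name@example.com", "Иван"]


def is_valid_name(value):
    """Return True if the whole string consists of allowed characters."""
    return NAME_RE.fullmatch(value) is not None


def check_names():
    """Return every sample the regex classifies incorrectly."""
    failures = [name for name in VALID if not is_valid_name(name)]
    failures += [name for name in INVALID if is_valid_name(name)]
    return failures
```

Using fullmatch() here stands in for anchoring the pattern with ^ and $, so a string with even one disallowed character is rejected.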
You can try out the regex and experiment with it here: https://regex101.com/r/ZfEE98/2