I need to match two-word names such as "Johny Cash", so I was using this regex:
L"^[A-Z][a-z]+\\s+[A-Z][a-z]+$"
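For context, here is a minimal sketch of how a pattern like this can be applied. It assumes `std::wregex` from the standard `<regex>` header, which is only a guess, since the regex library is not named above. The helper name `is_two_word_name` is purely illustrative.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (illustration only): returns true when `name` is
// exactly two capitalized ASCII words separated by whitespace.
// Assumption: std::wregex with the default ECMAScript grammar.
bool is_two_word_name(const std::wstring& name) {
    // Same pattern as above; [A-Z] and [a-z] cover ASCII letters only.
    static const std::wregex pattern(L"^[A-Z][a-z]+\\s+[A-Z][a-z]+$");
    // regex_match requires the whole string to match the pattern.
    return std::regex_match(name, pattern);
}
```

Under these assumptions, `is_two_word_name(L"Johny Cash")` should succeed, while a name containing a letter outside the ASCII ranges should be rejected.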
However, I realized that the names may contain diacritics as well, for example Johannes Fürnkranz. I came across the following solution and modified my regex to:
L"^[A-Z].\\p{L}+\\s+[A-Z].\\p{L}+$"
This assumes that the first letter of each word is never a diacritic. However, I am getting an invalid regex error. Any idea how to make it work?
Following a suggestion, I tried the solution mentioned here. The problem is that the function test_unicode() never returns for me when I try it here. I am not sure what I am doing wrong.