
I need to set a different font on a label depending on whether I'm displaying a plain English string or characters from languages that are not based on the Latin character set. So I want to know whether the entire string consists only of Latin characters. How can I do that in Swift? I have read this question, but I don't think I can apply that answer, because there is no way I can specify every Latin character, including marks and punctuation, one by one to be excluded from the detection.

Please help. Thanks.

Martin R
Chen Li Yong
  • Why do you think you can't specify all Latin characters? If you want to exclude the English letters only, the collection will be fairly small, fewer than 60 objects (i.e. uppercase and lowercase letters only). – Olter Jul 12 '17 at 08:25
  • But what about the dot, comma, semicolon, ampersand, and all the symbols that my keyboard does not list? And accented letters (e.g. á) are still considered Latin, because they can still be rendered correctly with the font. – Chen Li Yong Jul 12 '17 at 08:32
  • What characters are you looking for exactly? Do you consider á an ordinary character or not? What about Œ? Are you looking for "characters used in the english language" or for some "latin character set" or for "characters which your font can represent"? – Martin R Jul 12 '17 at 08:34
  • Note that Unicode is a standard for encoding *all* possible characters, so asking if a string "contains at least just one Unicode character" makes no sense. – Martin R Jul 12 '17 at 08:37
  • Yes, I consider it an ordinary (or Latin-based) character in my case. The string I need to print can sometimes contain Chinese, Burmese, Japanese, or anything else, and when it contains characters from a language that isn't based on the Latin alphabet, I need to use a different font. But if that's not possible, then I think I can put á in the non-ordinary group. – Chen Li Yong Jul 12 '17 at 08:38
  • Probably what I'm looking for is a way to define a range of characters that covers all Latin-based characters (so I can then invert it and get what I need) without having to specify them one by one. – Chen Li Yong Jul 12 '17 at 08:39
  • From here I can see that there's a possibility to specify a range of set characters. I just don't know how to modify the answer code to suit my need, specifying the set of latin characters. https://stackoverflow.com/questions/31244367/how-can-i-check-if-a-string-contains-chinese-in-swift – Chen Li Yong Jul 12 '17 at 08:41
  • @ChenLiYong: Did you try https://stackoverflow.com/a/31245380/1187415 with `\p{Latin}` instead of `\p{Han}`? – Martin R Jul 12 '17 at 08:43
  • Oh okay, I didn't know I could change it like that. I'll try it right away. Also, I noticed that the answer on that page is yours as well. :D – Chen Li Yong Jul 12 '17 at 08:46
  • Well... Technically you can specify a range of "unichar" objects. I.e. you can set an object to unichar(45), for example, and it will return a specific object from Unicode ("-" in this case). Therefore, theoretically, you can create a list of objects you want to exclude (check the Unicode symbols list; the Latin characters are all in a row there). But that's a rather complicated solution. – Olter Jul 12 '17 at 08:46
  • @MartinR By the way, I noticed that that detects whether there is _any_ Latin character. What is the regex to detect whether _the whole_ string has only Latin characters? – Chen Li Yong Jul 12 '17 at 08:54
  • @Olter Yes, I'm just weighing the alternatives for solving this, and I don't know whether specifying ranges would be complicated in this case. – Chen Li Yong Jul 12 '17 at 08:55

1 Answer


As in How can I check if a string contains Chinese in Swift?, you can use a regular expression to check that the string contains no character outside the "Latin" character class:

import Foundation

extension String {
    /// True if the string contains no character outside the Unicode "Latin" script.
    var latinCharactersOnly: Bool {
        return self.range(of: "\\P{Latin}", options: .regularExpression) == nil
    }
}

\P{Latin} (with a capital "P") is the pattern that matches any character not having the "Latin" Unicode character property, that is, any character whose Unicode script is not Latin.
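One thing to watch for: whitespace, digits, and punctuation belong to the script-neutral "Common" script rather than to "Latin", so a strict `\P{Latin}` check also rejects a string such as "Hello, world!". The following is a minimal sketch of a more permissive variant, assuming you want to accept those characters as well. The property name `latinOrCommonCharactersOnly` is hypothetical and not part of the answer above, and the pattern relies on the regex engine accepting `\p{Common}` and `\p{Inherited}` as script names.

```swift
import Foundation

extension String {
    /// Hypothetical helper (an assumption, not from the original answer):
    /// true if every character is in the Latin script or in the
    /// script-neutral "Common" / "Inherited" scripts (spaces, digits,
    /// punctuation, combining marks).
    var latinOrCommonCharactersOnly: Bool {
        return range(of: "[^\\p{Latin}\\p{Common}\\p{Inherited}]",
                     options: .regularExpression) == nil
    }
}

// The strict pattern finds a non-Latin character (the comma) here:
let strictMatch = "Hello, world!".range(of: "\\P{Latin}", options: .regularExpression)
print(strictMatch != nil)                              // true

// The permissive variant accepts punctuation, spaces, and digits ...
print("Hello, world! 123".latinOrCommonCharactersOnly) // true
// ... but still rejects text in another script.
print("Hello 你好".latinOrCommonCharactersOnly)         // false
```

Note that "Common" is broad: it also includes punctuation used mainly with other scripts, so whether this variant is appropriate depends on which glyphs your font can actually render.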

Martin R