I am trying to write a regular expression for Flex (used with C++) so that I can parse a construct in my programming language, whose keywords are in Hebrew. One of the constructs the regex needs to recognize is:
קו
Regexes I've tried:
"קו"
(קו)
[\u05E7\u05D5]
[\u05D5]{1}[\u05E7]{1}
[^\b\u05D5][\u05E7\b]
The first one worked, but then another of my patterns also matched the same text, which I don't want. That pattern is:
`[קראטוןםפשדגכעיחלךףזסבהנמצתץ]+`
I also tried writing the above pattern with Unicode escapes, as shown below, but it did not work:
[\u05D0-\u05EA]+
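For reference, here is a minimal C++ sketch of what that range looks like at the byte level. It assumes the source file and the scanned input are UTF-8 encoded and that Flex matches input byte by byte rather than by code point (that is my understanding of Flex, not something I have confirmed for every version). The helper names `utf8_encode` and `hex_bytes` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: encode one code point in the Basic Multilingual
// Plane (U+0000..U+FFFF) as UTF-8. Surrogates are not checked here.
std::string utf8_encode(std::uint32_t cp) {
    assert(cp <= 0xFFFF);  // this sketch does not handle 4-byte sequences
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // Two-byte form: 110xxxxx 10xxxxxx (all Hebrew letters land here)
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // Three-byte form: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: render the bytes of a string as space-separated hex.
std::string hex_bytes(const std::string& s) {
    std::string out;
    char buf[4];
    for (unsigned char b : s) {
        if (!out.empty()) out += ' ';
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(b));
        out += buf;
    }
    return out;
}
```

Calling `hex_bytes(utf8_encode(0x05D0))` and `hex_bytes(utf8_encode(0x05EA))` gives the two ends of the range as bytes. Under the assumptions above, every letter in U+05D0 to U+05EA is a two-byte sequence with the same lead byte, so a byte-oriented scanner would see two bytes per letter rather than one character.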
Ideally, I want my pattern to match either of the following two strings:
קו אחד = שלום
קו אחד
For these, I tried the following patterns, but none of them worked:
(קו)(\s)[קראטוןםפשדגכעיחלךףזסבהנמצתץ]+
(וק)\s+[קראטוןםפשדגכעיחלךףזסבהנמצתץ]+
[קראטוןםפשדגכעיחלךףזסבהנמצתץ]+\s+(וק)
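To pin down the match I am after, here is a minimal hand-written C++ sketch of the intended behaviour. It is not Flex and not a proposed solution. It assumes UTF-8 input, where the keyword is stored in logical order as qof (U+05E7) followed by vav (U+05D5), and where each letter in U+05D0 to U+05EA is the lead byte 0xD7 followed by a byte in 0x90..0xAA. The function names `is_hebrew_letter_at` and `starts_with_line_decl` are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if the two bytes at s[i], s[i+1] form the UTF-8
// encoding of a Hebrew letter in U+05D0..U+05EA (assumes UTF-8 input).
bool is_hebrew_letter_at(const std::string& s, std::size_t i) {
    if (i + 1 >= s.size()) return false;
    unsigned char lead = static_cast<unsigned char>(s[i]);
    unsigned char trail = static_cast<unsigned char>(s[i + 1]);
    return lead == 0xD7 && trail >= 0x90 && trail <= 0xAA;
}

// Hypothetical matcher: true if s begins with the keyword (qof, vav),
// then at least one space or tab, then at least one Hebrew letter.
// Anything after the identifier (such as " = ...") is left unexamined.
bool starts_with_line_decl(const std::string& s) {
    static const std::string keyword = "\xD7\xA7\xD7\x95";  // U+05E7 U+05D5
    if (s.compare(0, keyword.size(), keyword) != 0) return false;

    std::size_t i = keyword.size();
    const std::size_t ws_start = i;
    while (i < s.size() && (s[i] == ' ' || s[i] == '\t')) ++i;
    if (i == ws_start) return false;  // whitespace after the keyword is required

    std::size_t letters = 0;
    while (is_hebrew_letter_at(s, i)) {
        i += 2;  // each letter occupies two bytes in UTF-8
        ++letters;
    }
    return letters > 0;
}
```

Under these assumptions, both target strings start with the keyword bytes, whitespace, and a run of Hebrew letters, so `starts_with_line_decl` accepts them, while a string whose first two letters are stored in the opposite order (vav, then qof) is rejected.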
Ideally, I'd like to use Unicode escapes in all of my regular expressions.
Also, this is the table that I've been using for the Unicode characters: this link
Moreover, I have looked at the questions below and tried the posted solutions, but none of them worked. I only want the Hebrew letters that don't have dots, which are the code points U+05D0 to U+05EA, whereas those questions cover the characters with the dot system. Even so, I could not get their solutions to work after replacing the dotted code points with the non-dotted ones:
tried all solutions here
read through this, tried solution, no success
and this is for PHP, so not very helpful as I'm using C++