This is NOT a duplicate of "How to use conditionals when replacing in Notepad++ via regex": I am asking something very specific here, which I could not implement by following the information in that question. So kindly allow this question.

I want to replace a range of characters with a corresponding range of characters. So far, I can only do it with multiple operations.

For example, match any word that starts with a capital Latin character from the set [ABEZHIKMNOPTYX] followed by a Greek lowercase letter [α-ωά-ώ], and replace that first character with the similar-looking character from the Greek set [ΑΒΕΖΗΙΚΜΝΟΡΤΥΧ] (note: they look the same but are different characters).

What I have come up with so far is multiple replacements, i.e.

(A)([α-ωά-ώ])
Α\2

(B)([α-ωά-ώ])
Β\2

....

So that for example: Aνθρώπινος would become Ανθρώπινος

Bάτος would become Βάτος

Preferably this should work in EmEditor, Notepad++ being the 2nd option.

greektranslator
  • You can write a script to do this in EmEditor. https://gist.github.com/MakotoE/ea3d37515f2006123e32706b0bb024e6 It's important to save this as UTF-8 WITH signature. Go to Macros | Select... and select the macro. Do Macros | Run ___.jsee to use it. – MakotoE Oct 21 '19 at 18:39
  • @MakotoE, yes, I know about the script possibility; I wonder whether this could be done with a one-liner. **AdrianHHH**, kindly reread it; the change is to a **different language character**. What is **visually** the same is not necessarily the same on a computational level; Latin A and Greek A are different characters and belong to different character sets. – greektranslator Oct 23 '19 at 07:18

1 Answer

Notepad++ supports conditional replacement; you can use it like this:

  • Find what: (?:(A)|(B)|(E)|(Z)|(H)|(I)|(K)|(M)|(N)|(O)|(P)|(T)|(Y)|(X))(?=[α-ωά-ώ])
  • Replace with: (?{1}Α:(?{2}Β:(?{3}Ε:(?{4}Ζ:)))) and add the other Greek letters in the same way, one nested conditional per capture group
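
Writing out all the nested conditionals by hand is error-prone, so the two strings could be generated instead. The following is only a sketch in Python, not part of the original answer: the names `LATIN`, `GREEK`, `build_find` and `build_replace` are made up for illustration, and the letter list is assumed to be the one from the question with the repeated Z dropped.

```python
# Latin capitals and their Greek look-alikes, paired by position.
# Assumption: the letter set from the question, without the repeated Z.
LATIN = "ABEZHIKMNOPTYX"
GREEK = "ΑΒΕΖΗΙΚΜΝΟΡΤΥΧ"


def build_find(latin: str = LATIN) -> str:
    """One capture group per Latin letter, then the Greek lowercase lookahead."""
    groups = "|".join(f"({ch})" for ch in latin)
    return f"(?:{groups})(?=[α-ωά-ώ])"


def build_replace(greek: str = GREEK) -> str:
    """Nested conditionals of the form (?{1}Α:(?{2}Β:(...)))."""
    nested = ""
    # Build from the innermost (last) group outwards.
    for index in range(len(greek), 0, -1):
        nested = f"(?{{{index}}}{greek[index - 1]}:{nested})"
    return nested


if __name__ == "__main__":
    print("Find what:   ", build_find())
    print("Replace with:", build_replace())
```

The printed strings can then be pasted into the Find what and Replace with fields. Group numbers above 9 rely on the braced `(?{10}...)` form being accepted by the editor, which is worth checking on a small sample first.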

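Explanation of the find pattern:

(?:             # start non capture group
  (A)           # group 1: Latin "A"
  |             # or
  (B)           # group 2: Latin "B"
  |             # or
  (E)           # group 3, and so on: one group per Latin letter
)               # end non capture group
(?=             # positive lookahead, make sure we have after:
    [α-ωά-ώ]    # a small greek letter
)               # end lookahead

Explanation of the replacement (spread over several lines here for readability only; type it on a single line):

(?{1}           # if group 1 exists "A"
  Α             # replace with greek letter
  :             # else
  (?{2}         # if group 2 exists "B"
    Β           # replace with greek letter
    :           # else
    (?{3}       # and so on ...
      Ε
      :
      (?{4}
        Ζ
        :       # else: nothing (add further letters here)
      )
    )
  )
)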
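
To check the logic outside the editor, the same idea can be emulated in Python. This is a rough sketch rather than what Notepad++ does internally: `fix_lookalikes` is a hypothetical helper, the letter pairs are assumed from the question, and Python's `re` module has no conditional replacement syntax, so the number of the group that matched (`m.lastindex`) stands in for the nested `(?{n}...)` conditionals.

```python
import re

# Assumption: the Latin-to-Greek pairs from the question (repeated Z dropped).
LATIN = "ABEZHIKMNOPTYX"
GREEK = "ΑΒΕΖΗΙΚΜΝΟΡΤΥΧ"

# Same shape as the Find pattern: one capture group per Latin letter,
# then a lookahead requiring a lowercase Greek letter.
FIND = re.compile("(?:" + "|".join(f"({ch})" for ch in LATIN) + ")(?=[α-ωά-ώ])")


def fix_lookalikes(text: str) -> str:
    """Swap a Latin capital for its Greek twin when a Greek lowercase letter follows."""
    # m.lastindex is the number of the single group that matched
    # (1 for A, 2 for B, ...), so it selects the Greek letter at that position.
    return FIND.sub(lambda m: GREEK[m.lastindex - 1], text)
```

Calling `fix_lookalikes` on a word such as Bάτος (with a Latin B) swaps only the first letter, while purely Latin words such as Accompany or Bear are left alone because the lookahead fails.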

I ran a test with only two letters, "A" and "B", and replaced them with the visually distinct letters "X" and "Y" just to show how it works.

[Screen capture: before]

[Screen capture: after]

Toto
  • Thanks, I tried with `(A)|(B)` and `(?{1}X:(?{2}F:))` as a test and it worked. The point, though, is that I also need the second capture group in the regex, `([α-ωά-ώ])`, so that it will not match words which have only Latin characters, like Accompany or Bear. – greektranslator Oct 24 '19 at 10:19
  • @greektranslator: surround the whole regex with a non-capture group and add a positive lookahead at the end; see my edit – Toto Oct 24 '19 at 10:26
  • I get "invalid regular expression" for the find as you edited it. I also tried `(?:(A|(B)|(E)|(Z)))(?=[α-ωά-ώ])` with `(?{1}Α:(?{2}Β:(?{3}Ε:(?{4}Ζ:))))` but it would always replace matches with Α, irrespective of what the first letter was. – greektranslator Oct 24 '19 at 13:05
  • @greektranslator: Sorry, there was a typo: a closing parenthesis was missing just after `A`; see my edit – Toto Oct 24 '19 at 13:12