
I'm having trouble with a script that replaces plain letters with special (accented) characters to test a translation system. For example, cha-mate is really chá-mate, but it should also be tested with chã-mate, chã-máte and other variations. Instead of creating these variations, the script switches every occurrence of a character to the same special letter. Here's what it's printing:

chá-máte
chã-mãte

Here's what it should print in theory:

cha-máte
cha-mãte
chá-mate
chã-mate
etc.

Here's the code and the json utilized:

def translation_tester(word):
    esp_chars = {
        'a': 'áã',
    }

    #words = [word]
    for esp_char in esp_chars:
        if esp_char in word:
            replacement_chars = esp_chars[esp_char]
            for i in range(len(replacement_chars)):
                print(word.replace(esp_char, replacement_chars[i]))

def main():
    words = ['cha-mate']
    for word in words:
        translation_tester(word)

main()

Anyway, any help is appreciated, thanks in advance!

John Kugelman
  • Your specs are a bit confusing. Why not cha-mate, cha-máte, cha-mãte too? Why are you only changing the second a and not the first a? If the expected number of results is indeed 9, then it is basically a permutation. – Spinor8 Mar 13 '19 at 13:30
  • @Spinor8 sorry if it was confusing, tbh the way you presented would work fine, gonna edit the specs now so it's more clear, thanks! –  Mar 13 '19 at 13:45

2 Answers


To handle an arbitrary number of replacements, one option is recursion. This is how I did it.

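intword = 'cha-mate'
esp_chars = {'a': 'áã'}

def wpermute(word, i=0):
    if i >= len(word):
        # Nothing left to scan: the last character was just replaced (or the word is empty).
        print(word)
        return
    for idx, c in enumerate(word[i:], i):
        if c in esp_chars:
            # Branch: recurse once per replacement, resuming after this position.
            for s in esp_chars[c]:
                newword = word[0:idx] + s + word[idx + 1:]
                wpermute(newword, idx + 1)
        if idx == len(word) -1:
            # End of the word reached with the remaining characters left unchanged.
            print(word)

wpermute(intword)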

This prints the 9 different ways the word can be written:

chá-máte
chá-mãte
chá-mate
chã-máte
chã-mãte
chã-mate
cha-máte
cha-mãte
cha-mate
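
If the variants need to be collected rather than printed (for example, to feed them to the translation system), the same idea can be written as a generator. This is only a sketch, not the code above: the names `ESP_CHARS` and `variants` are made up for illustration, and it walks the word one index at a time instead of looping inside each call.

```python
ESP_CHARS = {'a': 'áã'}  # same mapping as in the answer; the name is illustrative


def variants(word, i=0):
    """Yield every spelling of `word` where each replaceable character
    from index `i` onward is either kept or swapped for a replacement."""
    if i == len(word):
        # Every position has been decided, so this spelling is complete.
        yield word
        return
    # Option 1: keep the original character; options 2..n: each replacement.
    for c in word[i] + ESP_CHARS.get(word[i], ''):
        yield from variants(word[:i] + c + word[i + 1:], i + 1)


result = list(variants('cha-mate'))
print(len(result))
```

The recursion depth equals the length of the word, which is fine for single words but would be a poor fit for very long strings.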
Spinor8
  • Thank you so much for this example! It was easy to implement the rest of the stuff on it. –  Mar 13 '19 at 16:59
  • I did some changes so I could try to understand how the code fully works but I'm having doubts about two sections, here's my comments and edits, hope you can give me some explanation or give me some material to read regarding it, again, thank you so much for the example you gave! –  Mar 13 '19 at 22:23
  • What sort of doubts do you have? The idea is that I split into multiple branches every time I encounter a character from esp_chars. After the first 'a', it splits into 3 branches. After the second 'a', each of the 3 branches splits into another 3. So altogether you have 9. Every time there is a split, I start the new count at the next index. You can see that the enumerate starts not at zero unless it is right at the beginning. Here is a link on recursion that is quite good but doesn't quite cover the case above. https://realpython.com/python-thinking-recursively/ – Spinor8 Mar 14 '19 at 03:34
  • A good debugger will help you understand how the code works. I personally use PyCharm. Put breakpoints and see how the variables flow. Of course, you should read the above on recursion first. Otherwise the tree-like traversal will confuse you. – Spinor8 Mar 14 '19 at 03:43
  • I just saw some of your attempted edits. Let me try to address them. wpermute(newword, idx + 1) is recursion. The function wpermute calls itself again. Read the above link. if idx == len(word) -1: is for when the iteration reaches the end of the word. In this case, we print it out. – Spinor8 Mar 14 '19 at 03:54
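
The branch counting described in these comments (3 branches at the first 'a', times 3 at the second) can be checked with a few lines. This is a hedged sketch: `count_variants` is a hypothetical helper, not part of the answer, and `math.prod` assumes Python 3.8 or newer.

```python
import math

esp_chars = {'a': 'áã'}  # same mapping as in the answer


def count_variants(word):
    # Each character contributes 1 choice (itself) plus one per replacement;
    # the choices at different positions are independent, so they multiply.
    return math.prod(1 + len(esp_chars.get(c, '')) for c in word)


print(count_variants('cha-mate'))  # 3 * 3 = 9
```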

There might be a nicer way to do this, but you can do the following (making sure to include the plain 'a' in the list of replacement chars):

import itertools
import re

def replace_at_indices(word, new_chars, indices):
    # Put new_chars[i] at position indices[i]. Each swap is one character
    # for one character, so the remaining indices stay valid.
    new_word = word
    for new_char, index in zip(new_chars, indices):
        new_word = new_word[:index] + new_char + new_word[index + 1:]
    return new_word

def translation_tester(word):
    esp_chars = {
        'a': 'aáã',  # the plain letter is listed too, so it takes part in the combinations
    }

    for esp_char, replacement_chars in esp_chars.items():
        # Positions of every occurrence; re.escape guards against regex metacharacters.
        indices = [m.start() for m in re.finditer(re.escape(esp_char), word)]
        # One tuple per way of choosing a replacement for each occurrence.
        for p in itertools.product(replacement_chars, repeat=len(indices)):
            print(replace_at_indices(word, p, indices))

def main():
    words = ['cha-mate']
    for word in words:
        translation_tester(word)

main()

For your example, this should give you:

cha-mate
cha-máte
cha-mãte
chá-mate
chá-máte
chá-mãte
chã-mate
chã-máte
chã-mãte
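
Note that the loop above treats each key of esp_chars on its own, so with several keys it would not mix replacements of different letters in one word. If that is wanted, a possible variation is to take the product over every position of the word at once. This is a sketch under assumptions: the `'e'` entry and the names `ESP_CHARS` and `all_variants` are hypothetical and only there to illustrate the idea.

```python
import itertools

# Each key maps to the plain letter plus its variants.
# The 'e' entry is a made-up example, not part of the original question.
ESP_CHARS = {'a': 'aáã', 'e': 'eé'}


def all_variants(word, mapping=ESP_CHARS):
    # For every position, the characters allowed there
    # (just the character itself when it has no mapping).
    options = [mapping.get(c, c) for c in word]
    # Cartesian product over all positions, joined back into strings.
    return [''.join(p) for p in itertools.product(*options)]


combined = all_variants('cha-mate')
print(len(combined))  # 3 * 3 * 2 = 18
```

Passing `mapping={'a': 'aáã'}` gives the same 9 spellings listed above.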

See also:

Find all occurrences of a substring in Python

generating permutations with repetitions in python

Replacing a character from a certain index

Mixolydian
  • thanks for the example and the links, there's some pretty good stuff there! –  Mar 13 '19 at 17:02