I want to replace some characters in a string using a pythonic approach.

A -> T
C -> G
G -> C
T -> A

Example:

AAATCGATTGAT

will transform into

TTTAGCTAACTA

What I did:

import re

def swap(string):
    # Swap A <-> T, using 'aux' as a temporary marker
    string = re.sub('A', 'aux', string)
    string = re.sub('T', 'A', string)
    string = re.sub('aux', 'T', string)
    # Swap C <-> G the same way
    string = re.sub('C', 'aux', string)
    string = re.sub('G', 'C', string)
    string = re.sub('aux', 'G', string)

    return string

It works, but I'm looking for a more Pythonic way to achieve this.

tripleee
lmalmeida

2 Answers

Use a dictionary with a comprehension and str.join:

translateDict = {
  "A" : "T",
  "C" : "G",
  "G" : "C",
  "T" : "A"
}

s1 = "AAATCGATTGAT"
reconstructed = "".join(translateDict.get(s, s) for s in s1)

Here you have the live example

Note the use of dict.get: if a letter is not in the dictionary, it is left unchanged.
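To make the fallback behaviour concrete, here is a minimal sketch of the same idea wrapped in a function. The names COMPLEMENT and complement are made up for illustration; only the mapping itself comes from the question.

```python
# Hypothetical names; the mapping is the one given in the question.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def complement(seq):
    # dict.get(base, base) returns the character itself when it has no mapping
    return "".join(COMPLEMENT.get(base, base) for base in seq)

result = complement("AAATCGATTGAT")
with_unknown = complement("ANNT-acgt")

print(result)        # TTTAGCTAACTA
print(with_unknown)  # TNNA-acgt
```

Anything outside the four keys (here "N", "-", and the lowercase letters) passes through untouched, because the dictionary is case-sensitive and has no entries for them.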

As @bravosierra99 suggests, you can also simply use str.translate (in Python 3, maketrans is a static method of str, not a function in the string module):

reconstructed = s1.translate(str.maketrans(translateDict))
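As a small sketch of the translate approach, assuming Python 3: str.maketrans also accepts two equal-length strings, which is arguably the shortest way to write this particular mapping. The variable name table is only illustrative.

```python
# Each character of the first string maps to the character
# at the same position in the second string.
table = str.maketrans("ACGT", "TGCA")

s1 = "AAATCGATTGAT"
reconstructed = s1.translate(table)
print(reconstructed)  # TTTAGCTAACTA

# Characters without an entry in the table are left unchanged.
print("ANT".translate(table))  # TNA
```

Building the table once and reusing it avoids recreating it for every string you translate.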
Netwave
  • Thanks for your answer. That's what I'm looking for <3 – lmalmeida Feb 28 '19 at 04:01
  • If you are going to do translation, then you should do it properly. https://www.programiz.com/python-programming/methods/string/translate – bravosierra99 Feb 28 '19 at 04:04
  • @bravosierra99 Your link is unreadable on mobile (big ad with close button outside the screen). Maybe link to the official documentation or a relevant Stack Overflow question instead? – tripleee Feb 28 '19 at 04:27
  • @tripleee, you have the link to the official documentation in the answer ;) – Netwave Feb 28 '19 at 04:50

Here's a refactoring of Chepner's (since deleted) answer that calls maketrans only once.

tt = str.maketrans({"A":"T", "C":"G", "G":"C", "T": "A"})
for s1 in "AGACAT", "TAGGAC", "ACTAGAA":
    print(s1.translate(tt))

It's also worth pointing out that you can chain calls to replace, though this is still clumsy and inefficient:

def acgtgca(s1):
    return s1.replace(
        "A", "\ue0fa").replace(
        "G", "\ue0fb").replace(
        "C", "G").replace(
        "T", "A").replace(
        "\ue0fb", "C").replace(
        "\ue0fa", "T")

This avoids using "aux" as a special marker in favor of two arbitrary characters out of the Unicode Private Use Area.

But again, the maketrans method is both neater and more efficient.
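If you want to check the efficiency claim on your own machine, a rough sketch with timeit could look like the following. The function names, the sample size, and the repetition count are arbitrary choices for illustration, and the timings will vary between systems.

```python
import timeit

# Build the translation table once, outside the timed function.
TABLE = str.maketrans("ACGT", "TGCA")

def with_translate(s):
    return s.translate(TABLE)

def with_replace(s):
    # Same chained-replace idea as above, using two Private Use Area markers.
    return (s.replace("A", "\ue0fa").replace("G", "\ue0fb")
             .replace("C", "G").replace("T", "A")
             .replace("\ue0fb", "C").replace("\ue0fa", "T"))

sample = "AAATCGATTGAT" * 1000

# Both approaches should produce identical output before we compare speed.
assert with_translate(sample) == with_replace(sample)

t_translate = timeit.timeit(lambda: with_translate(sample), number=200)
t_replace = timeit.timeit(lambda: with_replace(sample), number=200)
print(f"translate: {t_translate:.4f}s  replace chain: {t_replace:.4f}s")
```

The translate version makes a single pass over the string, while the chain makes six and builds an intermediate string at each step.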

tripleee
  • Maybe see also https://stackoverflow.com/questions/2484156/is-str-replace-replace-ad-nauseam-a-standard-idiom-in-python which has a nice alternative using `reduce` – tripleee Feb 28 '19 at 05:45