I started learning Python today and had a project in mind where the code generates a random nucleotide sequence. This is the code I'm using for now, and it's working perfectly fine:

import random

# Output labels for the two strands
nuc = '*Random nucleotide sequence (transcribed strand)*     :'
nucBNT = '*Random nucleotide sequence (non-transcribed strand)* :'
nucleotides_length = 30
possible_characters = "ATCG"
# Pick one random base per position, then join them into a single string
random_character_list = [random.choice(possible_characters) for _ in range(nucleotides_length)]
random_nucleotides = "".join(random_character_list)
print(nuc, random_nucleotides)

I then thought about having it generate the complementary nucleotide sequence (replacing adenine with thymine, guanine with cytosine, thymine with adenine, and cytosine with guanine), but this is my first time opening Python and I don't know how to do this despite searching the internet. Any help would be highly appreciated.

Have a great day

1 Answer

The simplest* way would be to set up a dictionary with an entry for each substitution. The keys (on the left) are the inputs, and the values (on the right) are what you want to replace each one with:

substitute = {
    "A": "T",
    "G": "C",
    "T": "A",
    "C": "G",
}

Then you can use this to replace each character in random_nucleotides, character-by-character:

completed_nucleotides = "".join([substitute[c] for c in random_nucleotides])

* for a certain value of "simplest", anyway
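
Putting the dictionary approach together with the generator from the question, a minimal end-to-end sketch could look like this. The `complement` helper and the English labels are assumptions added for illustration; they are not part of the original code.

```python
import random

# Pairing rules from the answer: each base maps to its complement.
substitute = {
    "A": "T",
    "G": "C",
    "T": "A",
    "C": "G",
}


def complement(sequence):
    """Return the complementary strand, one base at a time (hypothetical helper)."""
    return "".join(substitute[base] for base in sequence)


nucleotides_length = 30
random_nucleotides = "".join(
    random.choice("ATCG") for _ in range(nucleotides_length)
)
completed_nucleotides = complement(random_nucleotides)

print("Transcribed strand    :", random_nucleotides)
print("Non-transcribed strand:", completed_nucleotides)
```

Because every character is looked up independently, the As and Ts cannot interfere with each other, and complementing the result a second time gives back the original sequence.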


There are a number of general solutions out there for making multiple substitutions to a string using str.replace or re.sub. This answer is perhaps the most complete list of them. However, in this case the substitutions are cyclic(?): the As and Ts swap, and the Cs and Gs swap. If you apply each replacement to the whole string in turn, you replace all the As with Ts, then all the Ts (including the new ones) with As, and you're left with only As and no Ts.
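
The problem with whole-string replacements can be seen on a short, fixed sequence. This is only an illustration; the example value `"AATTCG"` is an assumption chosen so that the intermediate strings are easy to follow.

```python
sequence = "AATTCG"

# Naive approach: apply each replacement to the whole string in turn.
naive = sequence.replace("A", "T")  # "TTTTCG": old Ts and new Ts look the same
naive = naive.replace("T", "A")     # "AAAACG": every T, old or new, becomes A

print(naive)  # AAAACG, whereas the A/T swap should have given TTAACG
```

After the first call there is no way to tell which Ts were originally As, so the second call undoes the first and also converts the original Ts.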

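You could remove the cyclic condition by having one side of the substitution (for example, the replacement characters) be lowercase. Then you could use one of the more common multiple-substitutions-in-a-string methods and convert the result back to uppercase at the end.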
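
One way to read the lowercase idea is sketched below, assuming the input contains only uppercase A, T, C and G. The function name `complement_via_lowercase` is hypothetical.

```python
def complement_via_lowercase(sequence):
    """Complement a strand with chained replace calls (hypothetical helper).

    Replacements are written in lowercase, so a base that has already been
    replaced can never be mistaken for one that is still waiting its turn.
    """
    result = sequence
    for old, new in (("A", "t"), ("T", "a"), ("G", "c"), ("C", "g")):
        result = result.replace(old, new)  # str.replace is case-sensitive
    return result.upper()  # restore the usual uppercase notation


print(complement_via_lowercase("AATTCG"))  # TTAAGC
```

This works, but it is arguably more fragile than the dictionary lookup, since it silently depends on the input being all uppercase.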

Jack Deeth
  • Appreciated it a lot, but I still have one last question: what if I want to add another sequence where T is replaced by U for the RNA? And again, thank you very much – Othman Errouich Feb 12 '22 at 11:16
  • No problem! You'd just make a different `substitute` dictionary which has `"T": "U"`. – Jack Deeth Feb 12 '22 at 11:18
  • `substitute = { "T": "U", }` `arn_messager = "".join([substitute[c] for c in completed_nucleotides])` `print(arnM, arn_messager)` Like this? – Othman Errouich Feb 12 '22 at 11:26
  • That'd work, but if you're only making one kind of substitution you can use `arn_messager = completed_nucleotides.replace("T", "U")`. If you were making `arn_messager` from `random_nucleotides` you could make a second dictionary with 4 entries including `"T": "U"` - but you'd better give them both better names than `substitute`! – Jack Deeth Feb 12 '22 at 11:31
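
The options discussed in these comments can be compared on a fixed example. This is a sketch under a few assumptions: the value `"AATTCG"` and the names `dna_complement`, `rna_complement`, `t_to_u`, `arn_lookup` and `arn_direct` are made up for illustration. Note that a dictionary containing only `"T": "U"` raises a `KeyError` when indexed with any other base, so the lookup below uses `dict.get` with the character itself as the fallback. Also, assuming the goal is the same result as the two-step version, the combined dictionary maps `"A"` to `"U"` (complement to T, then T to U).

```python
random_nucleotides = "AATTCG"  # fixed example value, for illustration only

dna_complement = {"A": "T", "G": "C", "T": "A", "C": "G"}
# Hypothetical combined table: complement first, then T -> U.
rna_complement = {"A": "U", "G": "C", "T": "A", "C": "G"}

completed_nucleotides = "".join(dna_complement[c] for c in random_nucleotides)

# Option 1: a single kind of substitution, so str.replace is enough.
arn_messager = completed_nucleotides.replace("T", "U")

# Option 2: a one-entry dictionary; .get(c, c) leaves other bases unchanged.
t_to_u = {"T": "U"}
arn_lookup = "".join(t_to_u.get(c, c) for c in completed_nucleotides)

# Option 3: go straight from random_nucleotides with the combined table.
arn_direct = "".join(rna_complement[c] for c in random_nucleotides)

print(arn_messager)
print(arn_lookup)
print(arn_direct)
```

All three options produce the same string for this input; `str.replace` is the shortest when only one substitution is needed.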