I'm helping my girlfriend with an assignment. Part of this assignment is the calculation of the number of possible combinations in a DNA sequence that contains certain wildcard characters.
I'm using the following Python script:
from Bio import Seq
from Bio import SeqIO
short = SeqIO.parse('short.fasta', 'fasta')
long = SeqIO.parse('long.fasta', 'fasta')
# IUPAC dictionary
IUPAC = {
    'R': 2,
    'Y': 2,
    'S': 2,
    'W': 2,
    'K': 2,
    'M': 2,
    'B': 3,
    'D': 3,
    'H': 3,
    'V': 3,
    'N': 4
}
# Define method to count number of possible sequences
def pos_seq(seqs):
    d = {}
    for record in seqs:
        pos = 1
        name = record.id
        seq = record.seq
        for ltr in seq:
            if ltr in IUPAC.keys():
                pos = pos * IUPAC[ltr]
        d.update({name : pos})
        print(name + ": " + str(pos) + " possibilities")
    print("")
    print("end of file")
    print("")
    return d
print(pos_seq(short))
print(pos_seq(long))
The function pos_seq takes in a collection of sequences and returns the number of possibilities for each sequence.
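To make the counting rule concrete, here is a minimal sketch of the same logic on a plain string, without Biopython. The helper name count_possibilities and the example sequence are made up for illustration; the idea is only that each wildcard multiplies the running total by the number of bases it can stand for, while A, C, G and T leave it unchanged.

```python
# Number of bases each IUPAC ambiguity code can stand for
IUPAC = {
    'R': 2, 'Y': 2, 'S': 2, 'W': 2, 'K': 2, 'M': 2,
    'B': 3, 'D': 3, 'H': 3, 'V': 3,
    'N': 4,
}

def count_possibilities(sequence):
    """Multiply the option counts of every wildcard in one sequence string.

    Hypothetical helper, not part of the script above.
    """
    total = 1
    for letter in sequence:
        # Unambiguous letters (A, C, G, T) contribute a factor of 1
        total *= IUPAC.get(letter, 1)
    return total

example = "ACRYN"  # made-up sequence: R (2) * Y (2) * N (4)
print(count_possibilities(example))  # 16
```

This is what the inner loop of pos_seq does for each record.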
The script works fine and the function prints the correct answer on each iteration. However, I wanted to save the sequence's name and number of possibilities into a dictionary and return it.
Here is the problem: It always returns an empty dictionary (as defined at the start of the method).
Defining the dictionary globally (outside the function) works: the dictionary DOES get updated properly, so the problem may be that I defined the dictionary within the function. Maybe I need to change the .update line to specify that the dictionary I want to update isn't a global dictionary?
Long story to ask a simple question: I can't seem to use a function to create a dictionary, update it several times and then return it. It stays empty.
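Here is a minimal sketch of the pattern I'm trying to use, stripped of everything DNA-related. The function build_lengths and the names list are made up for illustration. As far as I understand, SeqIO.parse returns an iterator rather than a list, so the sketch also passes a generator to the function twice to show how that case behaves. I have not confirmed that this is what happens in my script.

```python
def build_lengths(items):
    """Create a dict locally, fill it in a loop, and return it."""
    d = {}
    for item in items:
        d.update({item: len(item)})
    return d

names = ["seq1", "seq22"]          # made-up stand-ins for record ids

from_list = build_lengths(names)   # a list can be looped over repeatedly

gen = (n for n in names)           # a generator can be consumed only once
first_call = build_lengths(gen)
second_call = build_lengths(gen)   # the generator is already exhausted here

print(from_list)    # {'seq1': 4, 'seq22': 5}
print(first_call)   # {'seq1': 4, 'seq22': 5}
print(second_call)  # {}
```

In this sketch the locally defined dictionary is returned filled, and only the second pass over the same generator comes back empty. So if pos_seq is ever called more than once on the same short or long object, that might explain an empty result.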
I'm not very experienced in Python, but I couldn't find an answer to this question online.
Thanks for the help!