I'm working out a few kinks in this code, and for some reason my validation_check method is not working. All I want it to do is validate that the user's input contains ONLY the letters G, C, A, T before moving on to the at_calculate method, which performs the maths on the input sequence. Any help/tips would be appreciated.

import re

from tkinter import *

class AT_content_calculator:

    def __init__(self, master):
        #initialising various widgets
        frame_1 = Frame(master)
        frame_1.pack()

        self.varoutput_1 = StringVar()

        self.label_1 = Label(frame_1, text="Please enter a DNA sequence:")
        self.label_1.pack()
        self.entry_1 = Entry(frame_1, textvariable=self.dna_sequence)
        self.entry_1.pack()
        self.output_1 = Label(frame_1, textvariable=self.varoutput_1)
        self.output_1.pack()
        self.button_1 = Button(frame_1, text="Calculate", command=self.validation_check)
        self.button_1.pack()

    def dna_sequence(self):
        self.dna_sequence = ()

    def validation_check(self):
        #used to validate that self.dna_sequence only contains letters G, C, A, T
        if re.match(r"GCAT", self.dna_sequence):
            self.at_calculate()
        else:
            self.varoutput_1.append = "Invalid DNA sequence. Please enter again."
            self.validation_check()

    def at_calculate(self):
        #used to calculate AT content of string stored in self.dna_sequence
        self.dna_sequence = self.entry_1.get()
        self.total_bases = len(self.dna_sequence)
        self.a_bases = self.dna_sequence.count("A")
        self.b_bases = self.dna_sequence.count("T")
        self.at_content = "%.2f" % ((self.a_bases + self.b_bases) / self.total_bases)
        self.varoutput_1.set("AT content percentage: " + self.at_content)

root = Tk()
root.title("AT content calculator")
root.geometry("320x320")
b = AT_content_calculator(root)
root.mainloop()

1 Answer

To validate that the user's input contains ONLY the letters G, C, A and T, put those characters in a character class, which matches any combination of them.

Note: self.dna_sequence is a method, not a string, so it can't be passed to re.match directly. Have that method return the entry's value:

def dna_sequence(self):
    # Return the current text of the entry widget
    return self.entry_1.get()

and then do :

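if re.fullmatch(r"[GCAT]+", self.dna_sequence()):

[GCAT]+ matches any combination of those characters with a length of 1 or more. re.fullmatch (Python 3.4+) requires the whole string to match; plain re.match only anchors at the start, so it would also accept input such as GCATQ. If you want the length to be exactly 4, use [GCAT]{4}.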
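
A minimal sketch of this check as a standalone function, separate from the GUI. The names DNA_PATTERN and is_valid_dna are hypothetical (they are not in the original post), and the sketch assumes Python 3.4+ for fullmatch. The last line shows why an unanchored re.match is not enough on its own.

```python
import re

# Hypothetical names, not from the original post.
DNA_PATTERN = re.compile(r"[GCAT]+")

def is_valid_dna(sequence):
    # True only when the WHOLE string consists of G, C, A, T (length >= 1)
    return DNA_PATTERN.fullmatch(sequence) is not None

for candidate in ["GATTACA", "GCATQ", "", "gcat"]:
    print(repr(candidate), is_valid_dna(candidate))

# re.match anchors only at the start, so a valid prefix is enough for it
print(re.match(r"[GCAT]+", "GCATQ") is not None)
```

The check is case-sensitive as written; whether lowercase input should be accepted (for example by calling upper() first) is a design choice the question does not settle.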

But this also matches sequences with repeated characters, such as GGCC. If you don't want that, you can use set.intersection:

if len(self.dna_sequence()) == 4 and len(set("GCAT").intersection(self.dna_sequence())) == 4:
    # do stuff

Or, more simply:

if sorted(self.dna_sequence()) == sorted("GCAT"):
    # do stuff
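
The two checks above accept only a four-letter input that uses each base exactly once. The comments clarify that the sequence can be any length, so a set-based check without regular expressions might look like the sketch below. The function names are hypothetical, and rejecting empty input is an assumption.

```python
VALID_BASES = set("GCAT")

def only_dna_bases(sequence):
    # Hypothetical helper: non-empty, and every character is G, C, A or T
    return bool(sequence) and set(sequence) <= VALID_BASES

def each_base_once(sequence):
    # The stricter check: exactly one G, one C, one A and one T
    return sorted(sequence) == sorted("GCAT")

for candidate in ["GGCC", "TACG", "QZQGQG"]:
    print(candidate, only_dna_bases(candidate), each_base_once(candidate))
```
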
Mazdak
  • Thank you, I made a change to the code but when I enter a sequence and press calculate I get a TypeError: expected string or buffer. –  May 28 '15 at 14:27
  • @LuanSwanepoel Which part of your code exactly does the error relate to? – Mazdak May 28 '15 at 14:32
  • C:\Python34\python.exe C:/Users/Luan/Desktop/prog.py
    Exception in Tkinter callback
    Traceback (most recent call last):
      File "C:\Python34\lib\tkinter\__init__.py", line 1533, in __call__
        return self.func(*args)
      File "C:/Users/Luan/Desktop/prog.py", line 28, in validation_check
        if re.match(r"[GCAT]+", self.dna_sequence):
      File "C:\Python34\lib\re.py", line 160, in match
        return _compile(pattern, flags).match(string)
    TypeError: expected string or buffer
    Process finished with exit code 0
    –  May 28 '15 at 14:33
  • @LuanSwanepoel and Kasra: "[GCAT]+" is the correct answer. Your other problem is that `self.dna_sequence` is a function, hence "TypeError: expected string or buffer". Kasra, the OP doesn't want length 4 ;) – Jose Ricardo Bustos M. May 28 '15 at 14:34
  • Hi there yes just to be clear the user can input any length of sequence. –  May 28 '15 at 14:37
  • `self.dna_sequence = self.entry_1.get()` before `match` and delete function `dna_sequence` – Jose Ricardo Bustos M. May 28 '15 at 14:37
  • @LuanSwanepoel It doesn't matter; the error occurs because `self.dna_sequence` is not a string! Check out the edit. – Mazdak May 28 '15 at 14:37
  • Kasra thank you for your help, is there any chance you could post the amended source code so I can see what you changed in full? –  May 28 '15 at 14:45
  • @LuanSwanepoel Welcome. As I said, you just need to replace your `dna_sequence` function with the one I posted in the answer. Thanks also to Jose Ricardo Bustos M. – Mazdak May 28 '15 at 14:49
  • Hi there Kasra; I amended the code but now I get an error File "C:/Users/Luan/Desktop/prog.py", line 29, in validation_check if re.match(r"[GCAT]+", self.dna_sequence()): TypeError: 'str' object is not callable –  May 28 '15 at 15:01
  • @LuanSwanepoel Ah... remove the leading `self.` from `dna_sequence` inside the function. – Mazdak May 28 '15 at 15:06
  • Hi there Kasra.. not sure what I am doing wrong here despite your good suggestions. If there is any chance you could post the code that is working for you that would be great. –  May 28 '15 at 15:39
  • @LuanSwanepoel I have not `tkinter` installed in my system. can you give me the error you get? – Mazdak May 28 '15 at 15:41
  • Hi there Kasra... I amended my code a little as re was giving me a headache. http://pastebin.com/6DnDAw3M Only problem now is when I enter something invalid like QZQGQG the validation_check just loops and loops; RuntimeError: maximum recursion depth exceeded. –  May 28 '15 at 16:18
  • @LuanSwanepoel I think it's because you call your function again in the `else` branch with the same invalid input! You need to pass the new input to your function or read the input again inside it. – Mazdak May 28 '15 at 20:08
  • If you run into another problem, I think it's better to ask a new question about it! That way you'll get a better answer. – Mazdak May 28 '15 at 20:09
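
The RuntimeError reported in the comments comes from validation_check calling itself in the else branch. Nothing changes the entry text between calls, so the same invalid input is checked again until the recursion limit is hit. In an event-driven GUI the handler can simply show the message and return; the next button click runs the check again. Below is a sketch of that idea, not the asker's final code. The names at_content_message and build_app are hypothetical, the logic is kept in a plain function so it runs without a display, and multiplying by 100 to get a true percentage is an assumption (the original prints the fraction).

```python
import re

def at_content_message(sequence):
    """Return the text to display for one button click (hypothetical helper)."""
    if not re.fullmatch(r"[GCAT]+", sequence):
        # Report the problem and return; no recursive call. The user
        # corrects the entry and clicks the button again.
        return "Invalid DNA sequence. Please enter again."
    at_bases = sequence.count("A") + sequence.count("T")
    # Assumption: scale by 100 so the number matches the "percentage" label
    percentage = 100.0 * at_bases / len(sequence)
    return "AT content percentage: %.2f" % percentage

def build_app(root):
    """Wire the helper to widgets on an existing Tk root (hypothetical)."""
    import tkinter as tk  # imported here so the helper above needs no display

    output = tk.StringVar(master=root)
    tk.Label(root, text="Please enter a DNA sequence:").pack()
    entry = tk.Entry(root)
    entry.pack()
    tk.Label(root, textvariable=output).pack()

    def on_click():
        # Read the entry afresh on every click
        output.set(at_content_message(entry.get()))

    tk.Button(root, text="Calculate", command=on_click).pack()
    return entry, output
```

To try it with a window, create a root with tkinter.Tk(), pass it to build_app, and call root.mainloop(). Because the regex rejects empty input, the division by len(sequence) is never reached with a zero length.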