The website I am trying to extract data from is http://www.genome.jp/dbget-bin/www_bget?ecs:ECs0037
and I am trying to extract the "nt sequence" field:
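    from selenium.common.exceptions import NoSuchElementException

    try:
        geneSeq = browser.find_element_by_xpath("html/body/div[1]/table/tbody/tr/td/table[2]/tbody/tr/td[1]/form/table/tbody/tr/td/table/tbody/tr[11]/td").text
    except NoSuchElementException:
        geneSeq = "file\nnot found"
    # Drop the first line of the extracted text
    geneSeq = geneSeq[geneSeq.find('\n')+1:]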
I remove the first line of the extracted text because I don't need it. However, the page source has br tags inside the sequence. They end up as line breaks in the output file, but Python does not seem to see them: I have tried .isspace(), which returns False, and therefore .rsplit() does not work. The line breaks still show up when I write the sequence to a file using f.write.
Is there a way to remove the br tag?
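
For reference, here is a minimal sketch of the result I am after. It assumes the br tags reach Python as ordinary newline characters inside the string returned by .text, which I have not confirmed. The sample string and the helper name strip_line_breaks are made up for illustration and are not taken from the real page.

```python
# Hypothetical sample standing in for the element's .text value; the real
# string would come from browser.find_element_by_xpath(...).text.
raw_text = "ECs0037 header line\natgaaacgca\nttagcaccac\n  cattaccacc \n"


def strip_line_breaks(text):
    """Drop the first line, then remove every whitespace character."""
    # Same slicing as in the question: keep everything after the first newline.
    body = text[text.find("\n") + 1:]
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, \n, \r), so joining the pieces removes all of it.
    return "".join(body.split())


gene_seq = strip_line_breaks(raw_text)
print(repr(gene_seq))  # repr() makes any remaining hidden characters visible
```

Note that .isspace() is only True when every character in the string is whitespace, so it returns False for a sequence that merely contains line breaks. Printing repr(geneSeq) on the real data should show whether the breaks are '\n', '\r\n', or something else.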